Preface



This volume consists of papers selected from the contributions presented at the
International Workshop on Applications of Graph Transformation with Industrial
Relevance (AGTIVE ’99). The papers underwent up to two additional reviews.
This volume contains the revised versions of these papers.

The workshop took place at Rolduc Monastery, Kerkrade, The Netherlands,
near Aachen, Germany, September 1-3, 1999. The workshop was an official
event of the Esprit Working Group APPLIGRAPH (Applications of Graph
Transformations), funded by the European Community. One day before the
workshop there was a subarea meeting of APPLIGRAPH on tools, the talks
of which are not included in this volume.

The graph transformation community is a medium-sized group of researchers
spread all over the world. The community is quite active (see the list of books at
the end of this volume). Many events of the community concentrate on theoretical
aspects or cover the whole spectrum from theory to applications. However,
the AGTIVE ’99 Workshop implemented a new idea.

For a couple of years now there have been a number of tools facilitating the
development of specifications of graph transformation systems and the
implementation of applications based on graph transformation. Furthermore, in the
past years there has been a shift of focus in graph transformation from theory
to applications within computer science and other engineering disciplines.

As these applications of graph transformation have reached a certain matu- 
rity, the idea was born three years ago to organize a workshop of presentations 
which could have an influence on industry or which had already been developed 
in cooperation projects with partners from industry. This should give a signal 
to industry that some benefit may be gained from graph transformation and the 
results and systems available. 

The program committee of the International AGTIVE Workshop consisted
of the following persons:

D. Blostein           Queen’s University, Kingston, Canada
H. Bunke              University of Bern, Switzerland
H. Ehrig              Techn. University of Berlin, Germany
G. Engels             University of Paderborn, Germany
S. Kaplan             University of Queensland, Australia
U. Montanari          University of Pisa, Italy
M. Nagl               Aachen University of Technology, Germany
F. Parisi-Presicce    University of Rome, Italy
M. Pezze              University of Milan, Italy
R. Plasmeijer         University of Nijmegen, The Netherlands
H.-J. Schneider       University of Erlangen, Germany
A. Schürr             University of the German Armed Forces Munich, Germany



The conference site Rolduc Monastery is an old abbey founded at the begin- 
ning of the 12th century. The building was first used as a monastery and later as 
an education center for priests for many centuries. In the 18th century the abbey 
gained a considerable income by exploiting a coal mine. Nowadays, Rolduc is an 
education and recreation center. 

The friendly and informal atmosphere of the workshop owed much to the
attractive workshop location at Rolduc Monastery. We would
like to thank the personnel of this education center who cared for us so well.

Giving demos of running systems was a part of the workshop. Accordingly,
short demo descriptions are included in this volume. The aim is to facilitate
contacts between industry and research groups offering certain systems.
Furthermore, in the following we give a list of systems with email and web addresses
in order to stimulate mutual contacts.

There was a conference dinner given at Vaalsbroek Castle at Vaals, The 
Netherlands, which all participants enjoyed very much. At the dinner there was 
a ceremony looking back 30 years to the origin of graph grammars. The inventors 
of the graph grammar discipline, namely John Pfaltz, Charlottesville, USA and 
Hans-Jürgen Schneider, Erlangen, Germany, were honored.

A panel discussion on “Industrial Relevance of Graph Transformation — The 
Reality and our Dreams” took place during the workshop. A summary of this 
panel discussion is included in this volume. 

In order to enliven the workshop there were three competitions: (a) for the 
best long paper, (b) for the best short paper, and (c) for the best demo presen- 
tation. There is a short report on these contests printed in this volume. 

The workshop was attended by 50 participants from 11 countries (Austria, 
Belgium, Brazil, Canada, Finland, Germany, Great Britain, Italy, The Nether- 
lands, Poland, USA). The success of the workshop is based on the activeness of 
participants contributing to the presentations and discussions, and on the work 
done by referees and, especially, by the members of the program committee. 

The workshop was made possible by grants given by the following organiza- 
tions: Deutsche Forschungsgemeinschaft (German Research Foundation), Freunde 
der Aachener Hochschule (Aachen Alumni Organization), the Rector of Aachen 
University of Technology and the European Community (APPLIGRAPH). All 
these donations are gratefully acknowledged. In particular, they have allowed 
researchers from abroad as well as young researchers to come to Aachen by par- 
tially financing their travel expenses. Furthermore, the grants covered part of 
the organizational costs of the workshop. 

Last but not least the editors would like to thank the members of the
organization committee consisting of A. Behle, K. Cremer, A. Fleck, F. Gatzemeyer,
D. Jager, M. Jürss-Nysten, O. Meyer, A. Schleicher, B. Westfechtel, and some
Master’s students of Computer Science from Aachen University of Technology.

Manfred Nagl
Manfred Münch




Available Systems for Graph Transformation or 
Systems Built on Graph Transformation 



Graph transformation tools: 

— AGG: visual tool environment consisting of editors, interpreter, and de- 
bugger for attributed graph transformation; attribute computation by Java; 
supports a hybrid programming style based on graph transformation and 
Java; Techn. University of Berlin, Germany. 

URL: http://tfs.cs.tu-berlin.de/agg 
email: stefan@cs.tu-berlin.de 

— BIZZAR2: three-dimensional film generation tool; University of Bremen, 
Germany. 

— Collage System: collage graph grammar based picture and film generation 
tool; University of Bremen, Germany. 

URL: http://www.informatik.uni-bremen.de/~ns/cs/Main.html
email: ns@informatik.uni-bremen.de

— DiaGen: hypergraph-grammar-based diagram editor generator; University 
of Erlangen, Germany. 

URL: http://www2.informatik.uni-erlangen.de/IMMD-II/Research/Activities/DiaGen/
email: minas@informatik.uni-erlangen.de

— DiTo: DiTo is a CASE tool that supports the development of distributed
applications using UML as the modeling language, Java as the implementation
language, and CORBA or Java RMI as the distribution middleware.

URL: http://IST.UniBw-Muenchen.DE/Tools/dito
email: ansgar@informatik.unibw-muenchen.de

— Fujaba: a programming environment which combines UML class and activity
diagrams with graph transformation rules (in the form of UML collaboration
diagrams). The environment consists of a graphical editor, a Java
code generator, and a Java reverse engineering tool.

URL: http://www.uni-paderborn.de/fachbereich/AG/schaefer/ag_dt/PG/Fujaba/fujaba.html
email: zuendorf@uni-paderborn.de

— GenGEd: GenGEd is a tool which allows for the (mainly) visual definition 
of visual languages and the generation of (syntax-directed) visual language 
editors. Its underlying formalism is the algebraic graph transformation ap- 
proach as implemented by the graph transformation tool AGG. 

URL: http://tfs.cs.tu-berlin.de/~genged/
email: maggi@cs.tu-berlin.de 

— Grrr: Grrr allows graph data structures to be visualized; it has a computa- 
tionally complete declarative programming method. 

URL: http://www.cs.ukc.ac.uk/people/staff/pjr6/gdgr/main.html
email: P.J.Rodgers@ukc.ac.uk 

— Klotho: biochemical compounds declarative database; uses layered graph 
grammars as its modeling language; University of Washington, USA. 
URL: http://www.ibc.wustl.edu/klotho/
email: toni@athe.wustl.edu 

— L-Studio: an integrated environment which supports the rule-based modeling
and especially the visualization of (growing) plants. It uses Lindenmayer
systems (L-systems) as its main underlying formalism.

URL: http://www.cpsc.ucalgary.ca/Redirect/bmv/lstudio 
email: vlab@cpsc.ucalgary.ca 

— OPTIMIX: compiler optimizer generator; combines DATALOG with graph 
rewriting; University of Karlsruhe, Germany. 

URL: http://i44www.info.uni-karlsruhe.de/~assmann/optimix.html
email: assmann@informatik.uni-karlsruhe.de 

— PROGRES: integrated set of tools for editing, analyzing, and executing
programmed graph transformation systems; supports rapid prototyping of 
graph manipulation tools with Tk/Tcl-based user interface; Aachen Univer- 
sity of Technology, Germany. 

URL: http://www-i3.informatik.rwth-aachen.de/research/progres
email: progres@i3.informatik.rwth-aachen.de

— PROP: a C++ based pattern matching language; supports term and graph 
rewriting; New York University, USA. 

— SMART: tree- and graph-based structure matching and rewriting tool; 
GMD, Germany. 

— TREEBAG: a picture generation tool which uses tree generating context- 
free grammars and tree transformations as the underlying formalism. 

URL: http://www.informatik.uni-bremen.de/~drewes/treebag
email: drewes@informatik.uni-bremen.de

Graph Transformation Tool Applications: 

— ACACIA: knowledge acquisition for explainable, multi-expert systems; graph 
transformations are used to manipulate Conceptual Graphs (Sowa); INRIA 
Sophia Antipolis, France.

URL: http://www.inria.fr/Equipes/ACACIA-eng.html 
email: Rose.Dieng@inria.fr 

— Angio Trace Diagrams: for constructing performance models of distributed 
systems; a graph transformation approach is used to analyse and transform 
these diagrams; Carleton University, Ottawa, Canada; now: Software Anal- 
ysis Inc., Sherwood Park, Canada. 

URL: http://home.istar.ca/~angio/
email: angio@istar.ca 

— IPSEN: Integrated/Incremental Project Support Environment; built with 
graph grammar engineering technology; Aachen University of Technology, 
Germany. 

URL: http://www-i3.informatik.rwth-aachen.de/research/ipsen 
email: nagl@i3.informatik.rwth-aachen.de

— Specification of Software Systems: activities at University (GH) of
Essen, Germany.

URL: http://www.informatik.uni-essen.de/Fachgebiete/SoftTech/
email: goedicke@informatik.uni-essen.de

— SUKITS: process modeling and a posteriori integration of CIM tools; tools 
are specified and generated using a graph transformation approach; Aachen 
University of Technology, Germany. 

URL: http://www-i3.informatik.rwth-aachen.de/research/sukits
email: bernhard@i3.informatik.rwth-aachen.de

— Visual specification language: activities of SEIS group at Leiden Univer- 
sity, The Netherlands. 

URL: http://www.liacs.nl/CS/SEIS/ 
email: engels@uni-paderborn.de 

Related Topics: 

— Concurrent Clean: functional programming language based on graph trans- 
formations; University of Nijmegen, The Netherlands. 

URL: http://www.cs.kun.nl/~clean
email: clean@cs.kun.nl 

— GRAS: Graph-Oriented Database System; Aachen University of Technol- 
ogy, Germany. 

URL: http://www-i3.informatik.rwth-aachen.de/research/gras
email: bernhard@i3.informatik.rwth-aachen.de

— Graph Drawing Database: activities of Arne Frick, University of Karl- 
sruhe, Germany. 

URL: http://i44www.info.uni-karlsruhe.de/~frick/gd/
email: arne@frick-consulting.de 

— Graph Drawing Server at Brown University, USA. 

URL: http://loki.cs.brown.edu:8081/graphserver/
email: ssb@cs.brown.edu 

— Visual Language Homepage: URL: http://www.cs.orst.edu:80/~burnett
(this is not an official homepage but it contains comprehensive information 
about visual languages) 
Term Graph Rewriting and Mobile Expressions
in Functional Languages 



Rinus Plasmeijer and Marko van Eekelen 



Computing Science Institute 
University of Nijmegen, the Netherlands 

{rinus, marko}@cs.kun.nl



Abstract. Clean is a functional language based on Term Graph Rewriting.
It is specially designed to make the development of real-world
applications possible by using a pure functional language.

In this paper we first give a short overview of the most important basic
features of the language Clean, among which is its Term Graph Rewriting
semantics. Of particular importance for practical use is Clean's
uniqueness typing, which enables destructive updates of arbitrary objects and
the creation of direct interfaces with the outside world, all within a purely
functional framework.

After this overview we will focus on a new language feature, which is 
currently being added. The new version of Clean offers a hybrid type 
system with both static as well as dynamic typing. Expressions, which 
are dynamically typed, are called Dynamics. With the help of Dynamics
one can create mobile expressions, which can be passed to other Clean 
applications. Dynamics can be used to make plug-ins which will be type 
checked at run-time. Typically, 30% of the code of an application is 
needed for storing (converting data to string) and retrieving (by means 
of a parser) of data. With Dynamics one can store and retrieve not only 
data but also code (!) with just one instruction. 

The implementation effort needed to support Dynamics is quite large: 
it not only involves dynamic type checking but also dynamic type unifi- 
cation, dynamic linking, just-in-time compilation, coding techniques for 
data and version management of code segments. 



1 Clean: a functional language for real world applications 

The Clean [23] system includes a very fast compiler (typically it compiles one 
to two orders of magnitude faster than comparable compilers for pure and lazy 
functional languages) and it generates state-of-the-art, native code. The system 
comprises an Integrated Development Environment, an editor, a project man- 
ager, a linker, a time and space profiling tool and a tool for creating interfaces to 
C. Almost all of this software is written in the language Clean itself. Clean is
available on a large variety of platforms such as Windows '95/'98/2000/NT,
MacOS, Unix (Sun) and Linux (PC). The Clean software can be downloaded
from the net (www.cs.kun.nl/~clean).



M. Nagl, A. Schürr, and M. Münch (Eds.): AGTIVE’99, LNCS 1779, pp. 1-13, 2000.
© Springer-Verlag Berlin Heidelberg 2000
Clean is currently mainly used in R&D environments. We have over 1000
administrated users (about 1/3 from industry). There is a Clean mailing list via 
which users frequently communicate about the use of Clean. Clean is commer- 
cially exploited by the Nijmegen University spin-off company Hilt. Among the 
commercial users of Clean are the Canadian Telecom Company Newbridge, 
Microsoft, the Dutch companies Philips, ABN-AMRO, Hollandse Signaal and 
the Dutch Department of Public Works. 

Applications written in Clean include a software development monitoring 
tool, a Java Applet generator, a music composition program, a machine code 
linker, a platform independent I/O library, an editor, an ordered linear resolution 
prover, an integrated software development environment, a projective geometry 
demonstration program, an audio-set software generation and configuration tool,
a sentence generator for spoken text, a network process profiling tool, a traffic 
light simulation tool and a game library for the development of 2D platform 
games [26]. 

Since the Clean I/O library is available on a large variety of platforms, it 
provides a platform independent interface for reactive Clean programs. Inter- 
active window based Clean programs can be ported to any of these platforms 
without any modification of source code. Programs retain the specific “look and 
feel” offered by the platform being used. 

Haskell [13] and Clean [9] [20] [22] developed independently as descendants
of the language Miranda [24] and were influenced by the language Gofer
(type classes for overloading [16]). The first publication on Clean dates back to
1987 [9] at the Functional Programming and Computer Architecture conference
in Oregon where the Haskell committee was formed.

Clean is a state-of-the-art lazy and pure functional language and as such it 
offers features like higher order functions, currying, lazy evaluation, (cyclic) shar- 
ing, lambda expressions and local definitions (where and let), guards and case 
expressions, patterns, list and array comprehensions, strong typing with Mil- 
ner/Mycroft type inference (with polymorphic types, abstract types, algebraic 
types, synonym types) extended with existentially quantified types, overloading 
via type classes and type constructor classes, predefined types and type construc- 
tors (integers, reals, Booleans, characters, files, lists, tuples, records, arrays), 
strictness annotations in (function and data) type definitions, separate compi- 
lation of modules (with implementation and definition modules with implicit or 
explicit imports). Apart from small details, the most important differences
between Clean and Haskell are that Clean has graph rewriting semantics,
offers uniqueness typing, and has a sophisticated library for defining window based
interactions. These distinctive features of Clean are explained in more detail
below.

2 The importance of graph rewriting 

There are several models of computation that can be viewed as a theoretical
basis for functional programming. The lambda calculus (see [6]), being a
well-understood mathematical theory, is traditionally considered to be the purest
foundation for modern functional languages like Scheme [11], ML [10], Haskell
[13], Miranda [24] and Clean [9] [20] [22]. And indeed, the merits of many
programming language concepts have successfully been investigated in the (typed)
lambda calculus. However, all state-of-the-art implementations of functional
languages are based not on lambda calculus but on graph rewriting. Consider,
with λ-calculus semantics, the following definition: square x = x * x. Reducing
the function application square (1 + 1) would then give (1 + 1) *
(1 + 1), with multiple copies of the argument expression. In practice such an
argument may contain a huge calculation, which is copied many times. Graph
rewriting semantics avoids this multiple evaluation by sharing the argument in
the graph structure. Since the single argument node is then addressed via two
pointers, its evaluation takes place only once. The resulting evaluation order
corresponds to call-by-need semantics.
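
The difference between copying and sharing the argument of square can be made concrete with a small sketch. It is written in Python rather than Clean, purely as an illustration; the names Thunk, expensive_argument, square_by_copying and square_by_sharing are hypothetical and do not come from the Clean system. The counter records how often the argument expression 1 + 1 is actually evaluated.

```python
class Thunk:
    """A suspended computation whose result is cached after the first force.

    Hypothetical stand-in for a shared node in a term graph.
    """

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._compute()
            self._compute = None  # the unevaluated expression is no longer needed
            self._done = True
        return self._value


evaluations = 0


def expensive_argument():
    """Stands in for the argument expression 1 + 1 and counts its evaluations."""
    global evaluations
    evaluations += 1
    return 1 + 1


def square_by_copying(make_arg):
    # Textual substitution: each occurrence of x gets its own copy of the argument.
    return make_arg() * make_arg()


def square_by_sharing(arg):
    # Graph rewriting: both occurrences of x point at the same node.
    return arg.force() * arg.force()


evaluations = 0
copied = square_by_copying(expensive_argument)
copy_count = evaluations  # number of evaluations under copying

evaluations = 0
shared = square_by_sharing(Thunk(expensive_argument))
share_count = evaluations  # number of evaluations under sharing
```

Both variants compute the same result; only the number of times the argument is evaluated differs, which is the point of the call-by-need reading above.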

Clearly, programmers writing real world applications will have to address 
much more practical aspects that are all related to graphs and graph rewriting: 
they worry about data structures, sharing, cycles, space consumption, efficiency, 
updates, input/output, interfacing with C, and so on. Many of these problems 
and solutions can be understood better when the semantics of the functional
language is extended to incorporate graph structures. This is the reason we have
sought a model of computation that is sufficiently elegant and abstract, but
at the same time incorporates mechanisms that are more realistic with respect to
actual implementation techniques. Graph rewriting systems [7] extend λ-calculus
with the explicit notions of pattern matching and sharing.

Graph rewriting has been proven to serve very well as a uniform framework: 
the programmer uses it to reason about the program and to control time and 
space efficiency of functions and graph structures; the implementer of the com- 
piler uses it to design optimisations and analysis techniques that are correct with 
respect to the graph rewriting theory; the theoretician uses it to build theory for 
complex analysis of graph rewriting systems used in concrete implementations; 
the language designer uses it to base the functional language constructs directly 
on the graph rewriting semantics. 

In our opinion it was the availability of this common framework in Clean 
that played the key role in various activities that usually are far apart: extending 
the language with new vital constructs that otherwise would never have been 
found, keeping the compiler fast and correct, enabling the programmer to write 
efficient programs and keeping the theory to the point. 

The programmer's choice of representation, e.g. the explicit use of a cycle, 
can drastically influence the algorithmic complexity (e.g. bringing it down from 
exponential to polynomial). The use of graph rewriting semantics makes it pos- 
sible for the programmer to model this choice. 
3 The importance of purity

An important aspect of Clean is that it is a pure functional language, like
Miranda and Haskell. Examples of impure functional languages are Lisp [18],
Scheme [11], ML [10], CaML [17] and Erlang [5]. In a pure language the re- 
sult of a function application will, under all conditions, be determined solely by 
the value of its arguments. This key property is called referential transparency. 
A consequence is that it never makes any difference where and under what con- 
ditions a function is applied. A function will always react in exactly the same 
way. This has important consequences. Assignments do not exist and side ef- 
fects cannot occur. A functional programmer can look at a piece of code and 
simplify or change it without the need to carefully scan every other function or 
procedure for context dependency. Reasoning about the result of a function can 
be done by uniformly substituting its definition at the place of the application 
regardless of the context. A C or Java programmer will always risk the surprise 
that calling the same function twice has totally different effects. In Clean, par- 
allel and distributed evaluation of any function application is allowed without 
worrying about changing the outcome of a program. In mathematics referential 
transparency and uniform substitution are two of the most fundamental prop- 
erties of reasoning which are used in all fields of mathematics (and which have 
been used extensively by every high school student). 

One might wonder how on earth one can write serious applications in a
functional language and retain purity. For instance, one certainly would like to
be able to change the contents of a file in a destructive way. Let's look at a simple
example: assume that one would allow an impure function fwritec, which as a
side effect appends a character to a file on disk and returns the modified file. For
example, the function fwritec can be used to construct the following function
AppendAandB, which returns a tuple:

AppendAandB file
    = (fwritec 'a' file, fwritec 'b' file)   // illegal in Clean !

The contents of the resulting tuple will depend on the order in which the
two applications of fwritec are evaluated. If fwritec 'a' file is evaluated
before fwritec 'b' file then 'a' is appended to the file before 'b'. If the
function applications are evaluated the other way around, 'b' will be written
before 'a'. This violates the rule of referential transparency: the result of a
function application is not solely determined by the value of the arguments but
also depends on the context, in this case the order in which the functions are
evaluated. The purity is lost and the example illustrates that it is indeed hard
to reason about the effect of such functions.
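
The order dependence of AppendAandB can be reproduced in a few lines. The following is a Python sketch, not Clean code: a mutable list stands in for the file on disk, and the left_first parameter is a hypothetical switch that models the two possible evaluation orders of the tuple components.

```python
def fwritec(char, file):
    """Impure stand-in: appends char to the file in place and returns the same object."""
    file.append(char)
    return file


def append_a_and_b(file, left_first):
    # Both applications receive the same file; only the evaluation order differs.
    if left_first:
        first = fwritec("a", file)
        second = fwritec("b", file)
    else:
        second = fwritec("b", file)
        first = fwritec("a", file)
    return first, second


# The same call on the same (empty) file yields different contents
# depending on which tuple component is evaluated first.
left_to_right = "".join(append_a_and_b([], left_first=True)[0])
right_to_left = "".join(append_a_and_b([], left_first=False)[0])
```

The arguments are identical in both calls, yet the resulting file differs, which is exactly the loss of referential transparency described above.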

In the first generation of functional languages one simply did not know how
to solve this problem and gave up purity. With this, many nice mathematical
properties are lost as well: mathematical analysis and everyday reasoning about
a program become almost as hard as in imperative languages such as C. One
has to establish for each function whether it has such an impure aspect (or calls
a function which has) before one can continue with standard reasoning.







If one is not prepared to give up purity then definitions such as AppendAandB 
have to be made impossible. Observe that the problem is caused by having 
several dangerous function applications (fwritec), which can be applied at the 
same time on the same object (file). 

The solution taken in Clean is based upon the observation that the problem 
does not occur when there is only one pointer to an updateable object (such 
as a file). The previous example is disallowed because it makes two copies of 
the file pointer. A Clean programmer can explicitly pass around destructively 
updateable objects like any other object but he can only do this in a safe manner 
such that no violation of referential transparency is possible. This is guaranteed 
by the uniqueness type system of Clean. 

In Haskell the problem is solved in the following way. A program yields a
higher order function (e.g. fwritec 'a') and the system applies this function
to the hidden state (via a so-called monad) to be updated (e.g. file). By using
function composition a whole sequence of state transitions can be requested. This
system is simple but has the disadvantage that all objects to be destructively
updated must be maintained by the system in a single state, which is kept hidden
from the programmer. Clean does not have this restriction. One can have arbitrary
states, which can be passed around explicitly. Such a state can be fractured into
independent parts (e.g. distinct variables for the file system and the event queue).
For a comparison of the two approaches see [25].

4 Uniqueness typing 

In order to determine (and specify) that an argument is passed in such a way that 
the required updates are possible and referential transparency is maintained, 
a type system has been added to Clean which derives so-called uniqueness 
properties [8]. A function is said to have an argument of unique type if there 
will be just a single reference to the argument upon evaluation of the function. 
Uniqueness typing can be seen as linear logic [14] extended with subtyping and
with a strategy-aware reference analysis. It is important to realize that talking
about references to nodes is very natural in Clean since Clean is based on
term graph rewriting. From its semantics it directly follows that it is safe for
the function to re-use the memory consumed by the argument to construct the
function result. For file I/O this means that the function fwritec should demand
its second argument to be of unique type. In the type this uniqueness is expressed
with an * attached to the conventional type. As a consequence, the result of applying
fwritec (the new file) can be constructed by destructively updating the unique
argument (the old file), as one is used to.

The function AppendAandB defined previously will not type check. But we 
can easily write a function that performs the two updates in sequence. 

fwritec:: Char *File -> *File
// type of predefined function which appends a character to a file

AppendAbeforeB:: *File -> *File
AppendAbeforeB file = fileAB
where
    fileA  = fwritec 'a' file     // append 'a' to the file
    fileAB = fwritec 'b' fileA    // then append 'b' to the file

To be able to write these definitions in a more natural order, Clean offers
a special let expression, indicated by a #. Its scope rules allow names to be reused.
The function AppendAbeforeB can also be defined as follows:

AppendAbeforeB:: *File -> *File
AppendAbeforeB file
    # file = fwritec 'a' file    // append 'a' to the file
      file = fwritec 'b' file    // then append 'b' to the file
    = file                       // and return the resulting file

The uniqueness type system is an extension on top of the conventional type 
system. The uniqueness type attributes can be inferred or checked by the com- 
piler, as all other type information. Offering a non-unique object to a function 
that requires a unique one is not type correct. Offering a unique argument if a 
function requires a non-unique one is fine: the type system can coerce a unique 
object to a non-unique one. 
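
The single-reference discipline can be imitated at run time in a short sketch. It is written in Python and is only an approximation: Clean rejects the offending program statically, at compile time, whereas the hypothetical UniqueFile class below detects a second use only when it happens. The names UniqueFile, UniquenessError and take are assumptions made for this illustration.

```python
class UniquenessError(Exception):
    """Raised when a file value is used a second time."""


class UniqueFile:
    """A file value that may be consumed at most once (run-time stand-in for *File)."""

    def __init__(self, contents=""):
        self._contents = contents
        self._consumed = False

    def take(self):
        # Hand out the contents exactly once; afterwards the value is dead.
        if self._consumed:
            raise UniquenessError("file value was already consumed")
        self._consumed = True
        return self._contents


def fwritec(char, file):
    """Consumes the old file and returns a new file with char appended."""
    return UniqueFile(file.take() + char)


def append_a_before_b(file):
    # Each file value is used exactly once, so the updates are sequenced.
    file_a = fwritec("a", file)
    file_ab = fwritec("b", file_a)
    return file_ab


def append_a_and_b(file):
    # Uses the same file value twice; the second use is rejected.
    return fwritec("a", file), fwritec("b", file)


result = append_a_before_b(UniqueFile()).take()

try:
    append_a_and_b(UniqueFile())
    rejected = False
except UniquenessError:
    rejected = True
```

Because every file value has a single consumer, the order of the two writes in append_a_before_b is fixed by data dependency alone, which is what makes the destructive update safe.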



4.1 Using Uniqueness Information 

The Clean uniqueness type system is very powerful and flexible, and it can be
used to solve several problems.

First of all it can be used for interfacing the pure functional world in an 
efficient way with the impure world outside. For any (impure) foreign function, 
method or procedure an interface function can be written in Clean. The object 
updated destructively in the imperative world can be protected by a uniqueness 
type in the functional world ensuring that within Clean the object can only 
single-threadedly be passed around from function to function retaining the purity 
of the language. In this way external objects which have an inherently unique 
physical representation (such as a file) can be rewritten, a record in a database 
can be updated, a picture in a window on a screen can be animated. For the 
language C a tool is available to easily generate interface functions. 

Uniqueness typing can also be used to make a functional program more
efficient by allowing the memory of predefined and user defined data structures
to be reused. This can give a huge gain in efficiency. For instance, Clean offers
the predefined type array and functions with which unique arrays can be
destructively manipulated (as efficient as in C, about 20 times faster than arrays
in the Glasgow Haskell compiler [15]).

Clean's “unique” features have made it possible to predefine (in Clean)
a sophisticated and efficient I/O library giving a program access to a unique
outside world and its unique sub-components (e.g. file system, event queue,
operating system). The I/O library [2] [3] [4] written in Clean enables a Clean
programmer to specify interactive window based I/O applications on a very high
level of abstraction.

5 The importance of mobile expressions 

With uniqueness typing, Input/Output can be incorporated in a pure functional
language without any problem. Once you have the technical ability to
perform I/O, you want to use the power of functional languages to do it more
elegantly and easily than one is traditionally used to.

In the average application, 30% of the program code is used for doing triv- 
ial I/O. When data is written to a file it has to be converted to an ASCII 
string. When data is read one needs a parser to check the input string. Although 
functional languages offer powerful language features to make life easier (type 
classes for overloading of I/O primitives, parser combinators for easy construc- 
tion of parse functions), still a lot of code has to be written. Would it not be nice 
to read and write complicated data structures from and to a file with just one in- 
struction? In the functional world the difference between data and code is much 
smaller than in the traditional imperative world: functions are first class citizens. 
So, why not have the possibility to communicate any expression (containing data 
as well as code) with just one instruction? And why not communicate with the 
same ease an arbitrary expression from one (distributed executing) Clean ap- 
plication to another? Such mobile expressions can be used to realize plug-ins 
which can be dynamically added to a running application. 

5.1 Clean's Hybrid Type System

When distributed Clean applications communicate expressions with each other, 
it is of course necessary to guarantee the type safety of the messages. Distributed 
applications are generally not developed at the same moment such that the 
consistency of the messages communicated between them cannot be checked 
statically by inspecting the source code. So, we need a dynamic type system 
to check the type consistency of the mobile expressions at run-time. But we do 
not want the entire Clean language to become a dynamically typed system as 
well. Static type checking has too many advantages, which we want to retain:
errors are reported as early as possible, program code is more readable, and
much more efficient code can be generated. To get the best of both worlds we
need a hybrid type system in which some parts of a program are type checked 
dynamically, while the largest part is still checked statically. 

The concept of objects with dynamic types, or dynamics for short, as introduced
by Abadi et al. [1], can provide such an interface ([21]). We distinguish
between expressions on which only static type checks are performed (statics) and 
expressions on which dynamic type checks are performed (dynamics). Values of 
this dynamic type are, roughly speaking, pairs of a value and an encoding of its 
corresponding static type, such that both the value as well as its type can be 
inspected at run-time.

From a static point of view all dynamics belong to one and the same
static type: the type Dynamic. Applications involving dynamics are no longer
checked statically, but type checks are deferred until run-time. Almost any
Clean expression can be explicitly projected from the static into the
dynamic world and vice versa.



5.2 Converting Statics into Dynamics 

The projection from the statical to the dynamical world is done by pairing 
an expression with an encoding of its type. The expression is wrapped into a 
container. The type of the expression is hidden from the statical world and the 
container receives the statical type Dynamic. 

A static expression of type T can be changed into a dynamic expression of 
static type Dynamic using the keyword dynamic. In principle, an expression of 
any type can be turned into a dynamic, e.g. functions of polymorphic type and 
functions working on unique types. Examples of expressions of type Dynamic 
are: 



(dynamic True :: Bool) :: Dynamic
(dynamic fib :: Int -> Int) :: Dynamic
(dynamic reverse :: [a] -> [a]) :: Dynamic



For instance, in the example above, the function reverse of static polymorphic
type [a] -> [a] is converted into a dynamic using the keyword dynamic. The
resulting expression (the container containing the pair reverse and the encoding
of its type [a] -> [a] to be used for type checking at run-time) is of static type
Dynamic.

Notice that Clean has a type inference system, so none of the static types
need to be specified explicitly. Static types can be left out and will automatically
be inferred by the static type system. So, instead of the expressions above, one
can also just write:

dynamic True 
dynamic fib 
dynamic reverse 
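
As an illustration outside Clean, the pairing of a value with an encoding of its type can be sketched in Python. This is only a rough analogue: the class name Dynamic, its field names, and the textual type encoding are assumptions made for this sketch and are not part of the Clean system, and Python performs no static checking of the wrapped values.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Dynamic:
    """A container pairing a value with an encoding of its static type."""
    value: Any
    typ: str  # hypothetical textual type encoding, e.g. "Int -> Int"


def fib(n: int) -> int:
    # Convention assumed here: fib 0 = fib 1 = 1.
    return 1 if n < 2 else fib(n - 1) + fib(n - 2)


def reverse(xs: list) -> list:
    return xs[::-1]


# Three values of very different types, all wrapped in the same container type.
dynamics = [
    Dynamic(True, "Bool"),
    Dynamic(fib, "Int -> Int"),
    Dynamic(reverse, "[a] -> [a]"),
]
```

All three elements share the single container type, while the original type survives only as run-time data inside the container.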



5.3 Converting Dynamics into Statics 

Besides projection into the dynamic world, there has to be a mechanism to inspect
expressions of type Dynamic and retrieve their original encoded static type
such that this type information can be used in the statically typed world again.
For this purpose the pattern match mechanism of Clean has been extended
to describe pattern matching on types as well. An example of such a dynamic
pattern match on types is:

Snapshot :: Dynamic -> JPeg
Snapshot (pict :: JPeg) = pict
Snapshot (movie :: MPeg) = toJPeg movie
Snapshot else = DefaultJPegPicture

If the type encoded in the dynamic matches the statically specified type in the 
pattern, and all other patterns match as well, the corresponding rule alternative 
is chosen. The type demanded in a pattern on the left-hand-side can now safely 
be assumed in the right-hand-side of the corresponding rule alternative. The 
type correctness of this can all be checked statically, since the type pattern is 
explicitly specified in the pattern and therefore known statically. 

Type Pattern Variables The type patterns need not fully specify the demanded
type: they may include type pattern variables, which match any
subexpression of the dynamic's type. If such a match has been successful, the
subexpressions are bound to the type pattern variables they have matched. So, a
full-blown run-time unification is used during matching of dynamics. A successful
unification leads to a substitution for the type pattern variables and possibly
for the (polymorphic) variables of the actual type.

The following function is polymorphic in the types of its arguments (and its 
result). It checks whether its first argument is a function and whether the type 
of its second argument matches the input type of that function: 

dynamicApply :: Dynamic Dynamic -> Dynamic
dynamicApply (f :: a -> b) (x :: a) = dynamic (f x :: b)
dynamicApply _ _ = dynamic "dynamic type error"

Now

Start = dynamicApply (dynamic fib :: Int -> Int) (dynamic 7 :: Int)

will reduce to

dynamic 21 :: Int

and

Start = dynamicApply (dynamic reverse :: [a] -> [a]) (dynamic [1, 2, 3] :: [Int])

will reduce to

dynamic [3, 2, 1] :: [Int]
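
The run-time unification behind dynamicApply can be sketched in Python as follows. The type encoding is an assumption of this sketch: base types are capitalised strings, type variables are lower-case strings, and compound types are tuples such as ("Fun", argument, result) and ("List", element). The occurs check is omitted, and the type variables of the two dynamics are assumed to have distinct names.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Dynamic:
    value: Any
    typ: Any  # hypothetical encoding: "Int", "a", ("List", t), ("Fun", t1, t2)


def is_var(t):
    # Lower-case strings stand for type (pattern) variables.
    return isinstance(t, str) and t[:1].islower()


def resolve(t, subst):
    # Follow variable bindings until an unbound variable or a non-variable.
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def unify(t1, t2, subst):
    """Return an extended substitution, or None if the types do not unify."""
    t1, t2 = resolve(t1, subst), resolve(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        return {**subst, t1: t2}
    if is_var(t2):
        return {**subst, t2: t1}
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and len(t1) == len(t2) and t1[0] == t2[0]):
        for a, b in zip(t1[1:], t2[1:]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None


def apply_subst(t, subst):
    t = resolve(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(a, subst) for a in t[1:])
    return t


def dynamic_apply(f, x):
    error = Dynamic("dynamic type error", "String")
    if not (isinstance(f.typ, tuple) and f.typ[0] == "Fun"):
        return error
    _, arg_type, result_type = f.typ
    subst = unify(arg_type, x.typ, {})
    if subst is None:
        return error
    return Dynamic(f.value(x.value), apply_subst(result_type, subst))


def fib(n):
    # Convention assumed here: fib 0 = fib 1 = 1.
    return 1 if n < 2 else fib(n - 1) + fib(n - 2)


dyn_fib = Dynamic(fib, ("Fun", "Int", "Int"))
dyn_reverse = Dynamic(lambda xs: xs[::-1], ("Fun", ("List", "a"), ("List", "a")))
```

Applying dyn_reverse to a dynamic of type ("List", "Int") binds the variable a to "Int", so the result carries the instantiated type ("List", "Int") rather than the polymorphic one.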

In Pil [21] it is shown that the type patterns to match on can be generalized
(by using a special kind of overloading) such that the type restriction imposed on a
dynamic type is determined by the static context in which the function is used
(so-called Type Dependent Functions). Type Dependent Functions allow us to
bring the types of dynamics locally specified into the scope of the type of the
corresponding function.

lookup :: [Dynamic] -> a | TC a
lookup [(x :: a) : xs] = x
lookup [x : xs] = lookup xs
lookup [] = abort "dynamic type error in lookup function"

This more powerful form of abstraction can be very convenient. For instance,
the lookup function above will look up the first element in the list of dynamics
that is of the required type. This type will depend on the static type required
by the environment in which the lookup function is used. The type variable
a will be unified with a type that depends on the application of lookup. The
encoding of this type (which is indicated by the type class constructor TC a) is
passed as an additional argument to lookup such that it can be used in the pattern
match of the first function alternative. As a consequence, lookup dynamiclist
+ 3 will add 3 to the first dynamic in the list containing an integer value, whereas
sinus (lookup dynamiclist) will take the sine of the first real value stored
in the list of dynamics. So, type dependent functions allow a flexible integration
of statically and dynamically typed expressions. The static context can impose
restrictions on the dynamic type being demanded.
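
The effect of passing the type encoding as an additional argument can be imitated in Python, which has no static context from which the wanted type could be derived. In this sketch the wanted type is therefore passed explicitly; the Dynamic class and the string type names are assumptions of the sketch.

```python
import math
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Dynamic:
    value: Any
    typ: str  # hypothetical textual type encoding


def lookup(dynamics, wanted):
    """Return the value of the first dynamic whose type equals `wanted`."""
    for d in dynamics:
        if d.typ == wanted:
            return d.value
    raise TypeError("dynamic type error in lookup function")


dynamiclist = [Dynamic("hello", "String"), Dynamic(0.0, "Real"), Dynamic(4, "Int")]

# The caller states the type it needs; in Clean this follows from the context.
as_int = lookup(dynamiclist, "Int") + 3
as_real = math.sin(lookup(dynamiclist, "Real"))
```

The explicit second argument plays the role that the text assigns to the implicitly passed type encoding.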

5.4 Communicating Dynamics 

A programmer can store a dynamic to a file in a similar way as he can write a 
character to a file. It can be done with just one function call. Reading a dynamic 
from a file can be done in a similar way too. 

writeDynamic :: Dynamic *File -> *File
// predefined function which appends a dynamic to a file

readDynamic :: *File -> (Bool, Dynamic, *File)
// predefined function which reads a dynamic from a file

sendDynamic :: Dynamic *Channel -> *Channel
// predefined function which sends a dynamic across a channel

Dynamics are very useful for the communication between distributed programs.
Any data structure or function can be communicated via a dynamic over
one and the same channel. A dynamic can be sent by applying the function
sendDynamic to the communication channel (which can be, e.g., a TCP/IP connection).
To receive a dynamic the programmer has to define a callback function
that will be applied automatically when the message has been received. These
communication primitives are offered by the standard Clean I/O library. The
receiving application has to test the actual type stored in the dynamic as
shown in the previous section.
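
A much simplified Python sketch of storing a dynamic and checking its type on receipt is given below. It handles plain data only and uses JSON lines as the storage format; both choices, as well as all names, are assumptions of the sketch and say nothing about the format used by the Clean library. Storing functions or unevaluated expressions requires the machinery discussed in the next section.

```python
import io
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Dynamic:
    value: Any
    typ: str  # hypothetical textual type encoding


def write_dynamic(d, f):
    # Append one dynamic to the file as a single JSON line.
    f.write(json.dumps({"type": d.typ, "value": d.value}) + "\n")


def read_dynamic(f):
    # Returns (success flag, dynamic); the flag is False at end of file.
    line = f.readline()
    if not line:
        return False, None
    record = json.loads(line)
    return True, Dynamic(record["value"], record["type"])


def expect(d, typ):
    # The receiver tests the type stored in the dynamic before using the value.
    if d.typ != typ:
        raise TypeError(f"expected {typ}, got {d.typ}")
    return d.value


buffer = io.StringIO()  # stands in for a file or a channel
write_dynamic(Dynamic([1, 2, 3], "[Int]"), buffer)
buffer.seek(0)
ok, received = read_dynamic(buffer)
```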



5.5 Implementation 

Dynamics are very convenient for the programmer, but hard to implement, in
particular in a heterogeneous distributed environment.

First of all one needs a platform independent format for storing and retrieving
dynamics. A dynamic in a program consists of an expression and an encoding
of the static type of that expression.

The expression is a term graph which might contain shared sub graphs and 
which can be cyclic as well. To avoid duplication of work it is important that 
this sharing is maintained. One can imagine several storage schemes to store a 
dynamic. One might optimise for compactness or for ease of access. Or one might 
prefer lazy reading instead of eager reading e.g. when the dynamic to read in is a 
huge structure. So, although one can store and retrieve a dynamic with just one 
instruction, several options for doing that need to be given to the programmer. 

When a dynamic is read in, the type has to be checked. To check for type
consistency, not only the type of the expression itself has to be available. Since
user-defined types are possible in the form of algebraic data types, one also has
to ensure that the definition of the types in the program and the definitions of
the types used in a received dynamic are identical. So, not only the type of the
expression, but all type definitions involved have to be stored with it as well.
And then, of course, dynamic typing needs to be implemented, including dynamic
unification and error handling.

But the most complicated things to deal with are caused by the fact that 
dynamics can contain partially applied functions as well as unevaluated expres- 
sions (since Clean is a lazy language). To be able to evaluate a function stored 
in a dynamic, a program needs the corresponding code. This code should be 
available in a platform independent format. Fortunately, we have such a format. 
The Clean compiler generates platform independent abstract machine code, 
ABC-code. The ABC-code is a kind of byte code, which is compiled by the code 
generator to native machine code (object code). So, if a dynamic is communi- 
cated from one application to another, also the code has to be made available. 
If the applications run on the same platform, the corresponding object code can 
be used. Otherwise the ABC-code has to be transmitted and just-in-time com- 
piled to object code. A dynamic linker has been developed (in Clean) which 
can dynamically link the code to the running receiving application. In this way 
one can add plug-ins to a running Clean application and use the dynamic type 
system to test the type correctness. 

Finally, another important implementation issue is the version management 
of dynamics. One can imagine a dynamic stored somewhere in a file in which 
some function is used. Now, suppose that one discovers a bug in the function 
definition and repairs it. When the dynamic is now being used one also would
like to incorporate the new function definition. However, one can also imagine a
completely new version of the software involved. Using the new definition instead
of the original one might now become dangerous. In conclusion, one needs a version
management system to determine which version of the code should be used.

6 Current situation and future developments

Dynamics are part of the new Clean system (version 2.0). Although the implementation
is not finished yet, most kernel facilities (dynamic type checking,
dynamic unification, just-in-time code generation, dynamic linking) have been
implemented and work. We believe that the availability of dynamics opens a new
world of dynamically extending applications. For instance, one can use it to
make a kernel operating system, which is initially very small but which grows
when new facilities are being used. One can also use it to dynamically repair or
modify an application which cannot be stopped. Examples of these are telephone
switch systems, airline reservation systems, or the communication software in a
satellite. One can use it to store the complete status of a program such that one
can pick up the work the next day in exactly the state in which one left it (persistent
programs). Or, one can simply use it to store and retrieve the settings of a program
with just one function call or fetch a plug-in from the internet.

The new Clean system will offer many facilities: sophisticated libraries,
which optionally can be included, static linking, dynamic linking, profiling tools,
and a debugging tool; even a dedicated proof system is under development [19].
To be able to use all these facilities and tools in a user-friendly way, a completely
new Integrated Development System has been designed and implemented which
makes use of the new I/O library. All the software is written in Clean. See our
WWW pages (www.cs.kun.nl/~clean) for the latest news.

Abstract. In this paper* we investigate the notion of transformation
modules as a structuring principle for the specification of graph transformation
systems which provide a collection of operations on graphs.
Based on the notion of transformation units, a concept that allows one to
specify binary relations on graphs, a transformation module consists of a
set of transformation units. To be able to distinguish between hidden and 
public operations, a module has an export interface. Moreover, there may 
be an import interface and a formal parameter. The import interface al- 
lows the use of transformation units which are known in the environment 
of a module. The formal parameter consists of formal parameter units 
which specify operations on graphs in a loose way. A formal parameter 
unit may be instantiated by an exported transformation unit of another 
module through module composition. 



1 Introduction 

Graph transformation is a rule-based framework for the modeling and analysis of 
information-processing systems the states of which are represented by graphs in 
a natural way. Although one encounters a number of applications of this kind in 
the literature (see, e.g., the Handbook of Graph Grammars and Computing by 
Graph Transformation and its Vol. 2 in particular [Roz97,EEKR99,EKMR99]), 
graph transformation is not yet used frequently in practice. However, there is 
some hope that this may change in the future and that graph transformation 
will be recognized as a useful and adequate specification and programming ap- 
proach (see, e.g., Andries et al. [AEH+99] for a more detailed discussion). The 
perspectives depend on many factors one of which is the quality and availability 
of structuring principles. 

In this paper, we contribute to the topic of structuring and study the notion 
of transformation modules and their composition which allow the modular devel- 
opment of large graph transformation systems from small entities. As a transfor- 
mation module is intended to provide a collection of operations on graphs and a 

* Partially supported by the EC TMR Network GETGRATS (General Theory of
Graph Transformation Systems) and the ESPRIT Working Group APPLIGRAPH
through the University of Bremen.

transformation unit is a proven concept for the specification of a single operation
on graphs (see [KK96,KKS97,AEH+99,KK99]), a transformation module encap- 
sulates a set of transformation units. To be able to distinguish between hidden 
and public operations, a module has an export interface. Moreover, there may be 
an import interface and a formal parameter. The import interface allows the use 
of transformation units which are known in the environment of a module. The 
formal parameter consists of formal parameter units which specify operations 
on graphs in a loose way. A formal parameter unit may be instantiated by an 
exported transformation unit of another module through module composition. 
It turns out that the result of a sequence of compositions depends only on the 
assignments of actual export units to formal parameter units, and not on the 
order in which successive modules are composed. In other words, it is meaningful 
to develop a graph transformation system as a network of small transformation 
modules which are connected by parameter assignments. 

To avoid unnecessary technicalities, we consider operations on graphs as bi- 
nary relations, i.e. as input-output relations between initial and terminal graphs, 
because this is the most intuitive and straightforward interpretation of rules and 
rule applications in a rule-based framework. However, operations with arbitrary 
arity on various types are implicitly available because many data structures
have nice graphical representations and a number of graphs can be seen as a 
single graph through the disjoint union. 

The paper is organized in the following way. The central concept of trans- 
formation modules is introduced and discussed in Section 4 based on transfor- 
mation units which are recalled in Section 2 and on formal parameter units 
which are introduced in Section 3 as a counterpart to transformation units with 
loose semantics. In Section 5, we define the composition of modules which al- 
lows the instantiation of formal parameter units of one module by export units 
of another module. All concepts are illustrated by a running example based 
on Dijkstra's shortest paths algorithm. In Section 6, we discuss the relation of
transformation modules to other module concepts in the framework of graph 
transformation. In contrast to the notions in [TS95,SW98,GRPS99] which are 
based on particular graph transformation approaches, transformation modules 
are approach independent, i.e. independent of a particular choice of graphs, 
rules, and direct derivations. Actually, our module concept follows the lines 
of research that led to the notion of transformation units and simple modules 
as structuring concepts of the graph and rule centered language GRACE (see 
[KK96,AEH+99,KKS97,HHKK98,KK99]). Apart from some minor differences, 
transformation modules generalize simple modules in the sense that they addi- 
tionally contain formal parameters with a loose semantics. In the conclusion, we 
briefly summarize the presented concepts and outline further research topics. 

2 Transformation Units 

In this section, we recall the concept of transformation units and their interleaving
semantics. Transformation units provide a structuring concept on the
basic level of sets of rules (see the references above and, for an early structuring
concept on a similar level, Kaplan et al. [KLG91]). They will be used as constituent
components of transformation modules in Section 4. A transformation
unit comprises a set of rules, descriptions of initial and terminal graphs, and 
a control condition. Moreover, it may use other transformation units where the 
control condition regulates how used units and rules interact with each other. All 
necessary ingredients like graphs, rules, rule application, graph class expressions, 
and control conditions are provided in an axiomatic way through the notion of 
a graph transformation approach. 

A graph transformation approach consists of a class $\mathcal{G}$ of graphs, a class of (graph
transformation) rules, a rule application operator $\Rightarrow$ yielding a binary relation
$\Rightarrow_r$ on graphs for every rule $r$, a class of graph class expressions, and a class of
control conditions.

If two graphs $G$ and $G'$ are related through $\Rightarrow_r$, we write $G \Rightarrow_r G'$ to symbolize
that $G'$ is directly derived from $G$ by applying $r$. A graph class expression $X$
specifies a set $SEM(X)$ of graphs, namely the set that contains all graphs which
are valid w.r.t. $X$. Typically, $X$ may be some logic formula describing a graph
property like connectivity, acyclicity, and the occurrence or absence of certain
labels.

A control condition determines, for example, the order in which rules may be
applied. Formally, it specifies a binary relation on graphs, $SEM_E(C)$, which
may depend on its environment $E$, i.e., on the choice of semantic relations for
identifiers that occur in the control condition. For instance, a class of control
conditions could be made up of regular expressions over the names of rules, or
of priorities assigned to rules, etc. For detailed information about different types
of control conditions see Kuske [Kus98].
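
To make the regular-expression variant concrete, the following Python sketch checks whether a sequence of applied rule names is admitted by a control condition. It looks only at the sequence of names, whereas formally a control condition specifies a binary relation on graphs. The rule names r1 and r2, the condition "r1 any number of times, then r2 once", and the encoding of a sequence as a space-terminated word are assumptions of the sketch.

```python
import re


def allowed(rule_names, pattern):
    """Check a sequence of rule names against a regular expression.

    Each name is followed by a single space, so that names such as
    "r1" and "r10" cannot be confused inside the encoded word.
    """
    word = "".join(name + " " for name in rule_names)
    return re.fullmatch(pattern, word) is not None


# Hypothetical condition: apply r1 arbitrarily often, then r2 exactly once.
R1_STAR_THEN_R2 = r"(?:r1 )*r2 "
```

For example, allowed(["r1", "r1", "r2"], R1_STAR_THEN_R2) holds, while a sequence in which r1 follows r2 is rejected.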

Example 1. Let us explain in an informal way the types of graphs and rules used 
in the running example of this paper. We consider the set of all directed graphs 
whose edges are labelled with non-negative integers (which will be interpreted as 
edge weights or, more specifically, distances between nodes). In addition, every 
node may carry an arbitrary (but finite) number of flags, as in the following 
graph. 




Here, the leftmost node carries two flags, one of them labelled “3”, and the
rightmost node carries one flag. Intuitively, the flags may be considered as labels of
the corresponding nodes. Thus, in the following we will also say that a node is
labelled with x if there is a flag with label x attached to it.

The type of rules used in examples should be rather intuitive and self-explanatory.
They consist of a left-hand side and a right-hand side. The applicability
of rules may be restricted by application conditions of two types: One may
require the absence of certain nodes, edges, or flags, and there may be boolean
expressions which compare labels. As an example, consider the following rule:




if m > n 



The rule can be applied to any subgraph consisting of an edge which connects
two nodes v and v', where v is labelled with a number m¹ such that m exceeds
the weight n of the edge, and v' does not carry a flag labelled with o (which is
indicated by the use of dashed lines instead of solid ones). An application of the
rule replaces the edge by an edge of weight m and adds a flag labelled with o to
v'. □

Based on a graph transformation approach our basic structuring construct 
can be defined. A transformation unit trut consists of 

— an initial and a terminal graph class expression, defining which graphs serve 
as valid initial and terminal ones, 

— a finite set of local rules,

— a finite set uses, in which transformation units are enumerated that can be
used by the importing unit, and

— a control condition which will be indicated by the keyword conds in examples. 

According to the idea of graph transformation, the operational semantics of 
a transformation unit trut yields a binary relation on graphs. Because the se- 
mantics of a transformation unit does not only depend on the local rules but also 
on imported transformation units, we get the following interleaving semantics as 
it is defined in Kreowski and Kuske [KK96]: 

$SEM(trut)$ is the set of all pairs $(G, G')$ such that

— $G$ is a valid initial graph according to initial, and $G'$ is a valid terminal graph
according to terminal,

— there are graphs $G_0, \ldots, G_n \in \mathcal{G}$ with $G = G_0$ and $G_n = G'$ such that, for
$i = 1, \ldots, n$, $G_{i-1} \Rightarrow_r G_i$ for some rule $r$ of trut or $(G_{i-1}, G_i) \in SEM(trut')$
for a used transformation unit trut'. The sequence of graphs is called interleaving
sequence because direct derivations are interleaved with calls to used
transformation units.

— Finally, $(G, G')$ must be allowed by the control condition $C$.
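
The definition above can be sketched in Python for the special case in which only finitely many graphs are reachable. The sketch makes strong simplifying assumptions: arbitrary hashable values (here integers) stand in for graphs, a rule is a function returning the set of directly derivable graphs, the semantics of a used unit is given as a finite set of pairs, and the control condition is omitted, i.e. every interleaving sequence is allowed. All names are hypothetical.

```python
def interleaving_semantics(initial, terminal, rules, used, candidates):
    """Compute the pairs (G, G') reachable by interleaving sequences.

    initial, terminal: predicates standing in for graph class expressions
    rules:             functions mapping a graph to a set of successor graphs
    used:              semantic relations of used units, as sets of pairs
    candidates:        finite collection of graphs to start from
    """
    pairs = set()
    for start in candidates:
        if not initial(start):
            continue
        reached, todo = {start}, [start]
        while todo:
            g = todo.pop()
            # One step: apply a local rule or call a used unit.
            successors = {h for rule in rules for h in rule(g)}
            successors |= {h for rel in used for (a, h) in rel if a == g}
            for h in successors - reached:
                reached.add(h)
                todo.append(h)
        pairs |= {(start, h) for h in reached if terminal(h)}
    return pairs


# Toy instance: one local rule that decrements, one used unit relating 5 to 1.
def decrement(g):
    return {g - 1} if g > 0 else set()


jump = {(5, 1)}
result = interleaving_semantics(lambda g: True, lambda g: g == 0,
                                [decrement], [jump], [5, 2])
```

A start graph that is itself terminal is related to itself, which corresponds to an interleaving sequence of length zero.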

Example 2. In this example we illustrate the concept of transformation units by 
discussing a collection of four transformation units whose main unit — which uses 
the other three — yields an implementation of Dijkstra's algorithm to compute
shortest paths (see [Eve79] or any other standard book on algorithmic graph 
theory). This example will be used as a running example throughout the paper. 

Dijkstra's algorithm takes as input a graph with one distinguished node $v_0$
carrying a flag query, say. Its output is the same graph, except that every node $v$
is labelled with a number $n \in \mathbb{N}_\infty$, where $n$ is the distance between $v_0$ and $v$. As
usual, the distance is defined to be the length of a shortest directed path from
$v_0$ to $v$, where the length of a path is given by the sum of weights of its edges.
If there is no path from $v_0$ to $v$ in the graph then $n = \infty$.

¹ As a convention, we use the letters l, m, n to denote numbers.

Let us briefly recall how the algorithm works. The nodes of the graph are
processed one by one. Nodes which have already been processed are marked with
a special label x. Initially, no node is marked, $v_0$ gets the flag 0 and all other
nodes are labelled with $\infty$. Now, in a loop the following steps are executed until
all nodes have been processed:

(i) select an unprocessed node $v$ whose label $m$ is minimal among all unprocessed
nodes;

(ii) for all nodes $v'$ with label $n$, if there is an edge from $v$ to $v'$ of weight $l$ such
that $m + l < n$, then replace $n$ by $m + l$;

(iii) mark $v$ as processed by adding a flag labelled x.

Finally, all x-flags are removed and the resulting graph is returned.
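
For comparison with the rule-based formulation developed below, steps (i) to (iii) can be written down directly in Python. This is a plain imperative sketch, not a transformation unit; the representation of the graph as a node list plus a dictionary of weighted edges, and all names, are assumptions of the sketch.

```python
import math


def shortest_paths(nodes, edges, query):
    """Label every node with its distance from the node `query`.

    nodes: iterable of node identifiers
    edges: dict mapping (source, target) to a non-negative weight
    """
    nodes = list(nodes)
    label = {v: math.inf for v in nodes}  # all other nodes start with infinity
    label[query] = 0                      # the query node gets the flag 0
    processed = set()                     # nodes carrying the x flag
    while len(processed) < len(nodes):
        # (i) select an unprocessed node with minimal label
        v = min((u for u in nodes if u not in processed), key=lambda u: label[u])
        # (ii) update the labels of the nodes reachable along one edge
        for (source, target), weight in edges.items():
            if source == v and label[v] + weight < label[target]:
                label[target] = label[v] + weight
        # (iii) mark v as processed
        processed.add(v)
    # Dropping `processed` corresponds to removing all x flags.
    return label


example_edges = {("a", "b"): 4, ("a", "c"): 1, ("c", "b"): 2, ("b", "d"): 5}
distances = shortest_paths(["a", "b", "c", "d", "e"], example_edges, "a")
```

The node e has no incoming path from a and therefore keeps the label infinity.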

Here is a transformation unit init which provides an input graph with the
initial labels 0 and $\infty$, respectively.




Here, $L_{query}$ denotes the set of all input graphs of the form described above.² The
control condition $r_1^* ; r_2$ should be read as a regular expression describing the
allowed sequences of rule applications. More precisely, only pairs of input/output
graphs $G, G'$ are considered such that $G$ can be transformed into $G'$ by applying
$r_1$ an arbitrary number of times and then $r_2$ exactly once. Together with the
fact that the nodes of terminal graphs are required to be labelled with 0 or $\infty$ it
follows that, in effect, only transformation sequences are considered in which $r_1$
is applied as long as possible. The same effect could be achieved by the control
condition $r_1! ; r_2$ (apply $r_1$ as long as possible and then $r_2$ exactly once). In this
case the condition on the terminal graphs would be superfluous.

The selection process in step (i) can be implemented by the following trans- 
formation unit. 



² Technically, a graph class expression from the given approach should be used to specify
$L_{query}$, but we do not want to discuss graph class expressions in the present paper.
Therefore, we may just assume that $L_{query}$ can be defined in the given approach.

Here, the absence of initial and terminal means that all graphs are allowed
as initial and terminal ones (which does no harm because the unit will only be 
applied in a suitable context). By adding a flag labelled o the unit first selects 
an arbitrary unprocessed node. Afterwards it repeatedly moves the flag to an 
unprocessed node which is labelled with a strictly smaller number. This is done as 
long as possible. Thus, upon termination the unprocessed node with the smallest 
label is the one which carries the o-labelled flag. 

Steps (ii) and (iii) can be realised by the following transformation unit.




Using the first rule, a node reachable from the selected node along an edge
of weight $l$ gets the new label $l + m$ (where $m$ is the label of the selected node)
if its present label is a larger one. When all these updates of flags have been
finished, $r_2$ is invoked in order to mark the selected node as processed.

Finally, the main unit shortest-paths₀ uses the other ones, applies them in an
appropriate order, and eventually removes the x-flags:

shortest-paths₀
  initial: $L_{query}$
  uses:    init, select, include
  rules:   r (removes an x-flag; rule diagram not reproduced)
  conds:   init ; (select ; include)! ; r!

Note that select (and thus select; include) applies exactly as long as there
are still unprocessed nodes. □

3 Formal parameter units

In this section, we introduce the notion of formal parameter units as another 
prerequisite for the definition of transformation modules in the next section. 

The semantic relations of transformation units are uniquely determined, de- 
pending on the semantics of the used units. In other words, the interleaving 
semantics is of a functional nature meaning that all components of a trans- 
formation unit must be completely defined. However, in intermediate stages of 
development one may prefer to leave parts of a specification incomplete. Or some 
constructions may work for a variety of transformation units, in which case it 
may be worthwhile to leave parts of a specification variable. Both can be accom- 
plished by the notion of formal parameter units. A formal parameter unit defines 
a class of relations on graphs. By assigning a transformation unit to a formal 
parameter unit, a specific relation out of the (perhaps many) possible ones can 
be chosen. 

A formal parameter unit consists of a name FORMAL and a requirement
REQ, written FORMAL = REQ, such that REQ has a loose graph transformation
semantics $LOOSE(REQ)$, which is a class of binary relations on graphs. A
transformation unit trut satisfies the requirement if $SEM(trut) \in LOOSE(REQ)$.
Moreover, another formal parameter unit FORMAL' = REQ' satisfies REQ if
$LOOSE(REQ') \subseteq LOOSE(REQ)$.

There are at least two significant kinds of requirements which provide useful 
formal parameter units. First, one may restrict the class of all binary relations on 
graphs by global requirements on relations like totality, injectivity, surjectivity, 
functionality, etc. As a default requirement one may use none, meaning that no 
restriction is imposed and the formal parameter unit is a placeholder for any 
actual unit. Secondly, the requirement may consist of a pair $(C_1, C_2)$ of two
specifications of binary relations $SEM(C_1)$ and $SEM(C_2)$ on graphs, which serve
as lower and upper bounds, i.e., $LOOSE((C_1, C_2))$ contains all relations $R$ such
that $SEM(C_1) \subseteq R \subseteq SEM(C_2)$.
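
For finite relations, membership in the loose semantics given by a lower and an upper bound is a pair of subset tests. The Python sketch below represents relations as sets of pairs; the integers standing in for graphs and all names are assumptions of the sketch.

```python
def satisfies(relation, lower, upper):
    """True if lower ⊆ relation ⊆ upper, with relations as sets of pairs."""
    return lower <= relation <= upper


# Hypothetical bounds: (1, 2) is required, anything outside `upper` is forbidden.
lower = {(1, 2)}
upper = {(1, 2), (1, 3), (2, 3)}

candidate = {(1, 2), (2, 3)}   # lies between the bounds
too_small = {(2, 3)}           # misses the required pair (1, 2)
too_large = {(1, 2), (3, 1)}   # contains the forbidden pair (3, 1)
```

With an empty lower bound and the full relation as upper bound, every relation is admitted, which mirrors the default requirement none.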

Typical ways to specify binary relations on graphs are pairs $I \times T$ of (initial
and terminal) graph class expressions describing the input/output relation
$SEM(I) \times SEM(T)$. Another possibility is the use of logical formulas, and especially
monadic second-order formulas, which describe binary relations on graphs
using quantification over nodes, edges, sets of nodes, sets of edges, membership, 
inclusion, and the usual boolean connectives. And not to forget, the lower and 
upper bound relations may be specified as transformation units. In the next 
section examples of formal parameter units will be presented. 

4 Transformation modules 

Although the concept of transformation units allows the use of other units, every 
unit specifies a single binary relation on graphs. However, it is often desirable 
to group several relations which logically belong together. This leads to the
notion of graph transformation modules, modules that have import and export
interfaces, which can provide more than one binary relation, i.e. offer sets of
functions to be used by others.

In this section, the notion of a transformation module is introduced as an 
encapsulation of a set of transformation units. A unit of a module may be defined 
by the use of other units of the module, it may be imported if the unit is known 
in the environment, or it may be a formal parameter which is loosely specified 
and may be instantiated later. Those units which are publicly available belong 
to the export interface of a module. 

If one restricts the consideration to modules with a hierarchical, acyclic use 
structure of their units, the semantics of modules can be directly derived from 
the interleaving semantics of transformation units. The semantics associates a 
semantic relation to each unit of the module depending on the formal parameters. 

A transformation module MOD consists of a body, BODY, an import interface,
IMPORT, a formal parameter, PAR, and an export interface, EXPORT.
BODY is a set of transformation units defined by the module, IMPORT is a set
of identifiers referring to units known in the environment, PAR is a set of formal
parameter units, and EXPORT is a subset of BODY ∪ IMPORT ∪ PAR providing
those units of the module which are publicly available. Moreover, every
body unit is assumed to use only units of BODY, IMPORT, and PAR.

Note that in contrast to the formal parameters which specify a set of bi- 
nary relations on graphs, the import part consists of names of fully specified 
transformation units which are available in the environment of the module. In 
practice, the imported items could be provided by the export parts of other 
transformation modules, like in Modula-2, for example. 

The latter condition of the definition of transformation modules allows the 
description of the use structure of a module as a directed graph with the
(identifiers of the) units of BODY, IMPORT, and PAR as nodes and with an edge
from v to v' whenever v' uses v. In the following, we restrict our consideration
to modules with acyclic use structures because - in this case - the semantics of 
a module can be defined along its use structure. 

Initially, the semantic relations of the imported units are given and, for each 
of the formal parameter units, one relation of its loose semantics is chosen. Then 
the following procedure is repeated: A unit gets its interleaving semantics if all 
used units are already provided with semantic relations. The number of steps is 
bounded by the length of the longest path in the use structure. In particular, 
every exported unit defines a semantic relation depending on the choice of the 
actual parameters. 
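
The bottom-up procedure described above can be sketched as follows. This is an illustration under simplifying assumptions, not the interleaving semantics itself: relations are finite sets of pairs of graph identifiers, and each body unit is given by a hypothetical builder function that computes its relation from the relations of the units it uses. All names (module_semantics, compose, the unit names) are made up for the example.

```python
from typing import Callable, Dict, List, Set, Tuple

Relation = Set[Tuple[str, str]]
Builder = Callable[[Dict[str, Relation]], Relation]


def compose(r: Relation, s: Relation) -> Relation:
    """Sequential composition of two relations (a stand-in for a control condition)."""
    return {(g, h) for (g, m) in r for (m2, h) in s if m == m2}


def module_semantics(
    uses: Dict[str, List[str]],
    build: Dict[str, Builder],
    given: Dict[str, Relation],
) -> Dict[str, Relation]:
    """Assign a relation to every body unit along an acyclic use structure.

    `given` holds the relations of the imported units and the relations
    chosen for the formal parameter units.
    """
    sem = dict(given)
    pending = set(uses) - set(sem)
    while pending:
        # A unit is ready once all units it uses already have a relation.
        ready = {u for u in pending if all(v in sem for v in uses[u])}
        if not ready:
            raise ValueError("use structure is cyclic or refers to unknown units")
        for u in ready:
            sem[u] = build[u](sem)
        pending -= ready
    return sem


given = {"preprocessing": {("G0", "G1")}, "init": {("G1", "G2")}}
uses = {"prepared": ["preprocessing", "init"], "exported": ["prepared", "init"]}
build = {
    "prepared": lambda sem: compose(sem["preprocessing"], sem["init"]),
    "exported": lambda sem: sem["prepared"] | sem["init"],
}
semantics = module_semantics(uses, build, given)
print(semantics["exported"])
```

Each pass of the loop handles all units whose used units are already known, so the number of passes is bounded by the length of the longest path in the use structure, as stated above.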

If the formal parameter is empty, the semantics of a module is uniquely de- 
termined as a semantic relation for each export unit. Such modules are called 
fully specified. If one wants to transform a module with a non-empty formal 
parameter into a fully specified one, the formal parameter units may be instan- 
tiated by exported units of other modules. This kind of module composition is 
discussed in detail in the next section. However, the following example might 
already provide some intuition. 
Example 3. The transformation units of Example 2 can be turned into a
transformation module paths with one formal parameter unit (and empty import),
as follows:



paths

param: preprocessing

body: init, select, include, shortest-paths

export: shortest-paths



Here, shortest-paths is obtained from shortest-paths_0 by adding preprocessing
to the list of used transformation units and replacing the control condition by
preprocessing ; init ; (select ; include)! ; r1. Thus, as the name indicates, the
formal parameter unit may be instantiated in order to add some pre-processing
phase which modifies the input graph in some manner. The formal parameter
unit has the form preprocessing = (empty, C), where empty specifies the trivial
lower bound given by the empty relation and C is some logical description of
the binary relation on L_query containing pairs (G, G') of graphs with the same
nodes such that G ⊆ G'.

Thus, in every instantiation of the formal parameter unit it can only add new 
edges. As a transformation module which provides two transformation units that
may serve as actual parameters, let us consider the module 

direction 

body: id, undirect

export: id, undirect

where id, which will not be shown explicitly, is a transformation unit that yields
the identity on L_query, and undirect is as shown below.



undirect

initial: L_query

rules: r: [rule diagram not reproduced]

terminal: reduced(r)




Here, reduced(r) denotes the set of all graphs which are reduced with respect
to r, i.e., those to which r does not apply. Thus, in effect, every input graph is 
turned into its undirected version by adding all the reverse edges. 
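
The effect of undirect can be illustrated by a small sketch. It assumes unlabelled edges given as pairs of node names and assumes that the rule r adds the reverse of an edge whenever that reverse edge is missing; the rule itself is not spelled out in the text, so this reading is an assumption, and the function name is hypothetical.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]


def undirect(edges: Set[Edge]) -> Set[Edge]:
    """Apply the assumed rule r (add a missing reverse edge) as long as possible."""
    result = set(edges)
    while True:
        # Reverse edges that are still missing: the places where r applies.
        missing = {(t, s) for (s, t) in result if (t, s) not in result}
        if not missing:
            # The graph is reduced with respect to r, i.e. it is terminal.
            return result
        # One rule application adds one reverse edge.
        result.add(missing.pop())


print(sorted(undirect({("a", "b"), ("b", "c")})))
```

The loop stops exactly when r no longer applies, which mirrors the terminal graph class expression reduced(r).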

If the formal parameter unit in paths is instantiated by id we simply get an 
implementation of the original algorithm computing shortest paths. In contrast, 
if we use undirect as its actual parameter the algorithm computes shortest
undirected paths. In particular, a node u in an output graph will be labelled with a
number n < ∞ if and only if it belongs to the same connected component as the
query node. □
5 Composition

As indicated by the previous example, a formal parameter unit of a module 
may be instantiated by some export unit of another module. This leads to the 
notion of compositions of modules which yields a composite module by joining 
the components of the modules and replacing the formal parameter unit by the 
corresponding actual unit wherever the formal parameter occurs in the body. 

The composition of modules turns out to be associative: the result of a se- 
quence of compositions does not depend on the order in which successive com- 
positions are performed. As a nice consequence, a module system can be repre- 
sented in a graphical way with elementary modules as nodes and instantiations 
of some formal parameter units of one module by some export units of another 
module as an edge. Such a module structure defines a module by means of module
composition if no formal parameter unit is instantiated twice and the module
structure is acyclic.

Let MOD and MOD' be two modules and let a be a partial assignment mapping
some formal parameter units of MOD to export units of MOD' such that
for every formal parameter unit par in the domain dom(a) of a, the exported
unit a(par) satisfies the requirement of par. Then the composition of MOD and
MOD' through a is denoted by MOD ∘_a MOD' and yields the module with the
body a(BODY) ∪ BODY', the import interface IMPORT ∪ IMPORT', the formal
parameter (PAR - dom(a)) ∪ PAR', and the export interface a(EXPORT) ∪
EXPORT'. In addition to taking the union in all components, the composition
removes the domain of definition of the partial assignment a from the
formal parameter, denoted by PAR - dom(a), and replaces all occurrences of
units of dom(a) in BODY and EXPORT by the corresponding units, denoted
by a(BODY) and a(EXPORT), respectively. The export interface is only affected if a
formal parameter unit par of dom(a) is exported in MOD. Then a(par) is in
a(EXPORT) and, by definition, in EXPORT', too. In the body, the formal
parameter unit par may be used by body units and hence occur in control
conditions. In a(BODY), a(par) is used instead and replaces par in control conditions.
An interesting special case of the composition is given by the totally undefined
assignment. In this case, composition is just the componentwise union. This
module-building operation is interesting in its own right. To emphasize this, it
is denoted by MOD + MOD'.
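
The composition can be sketched at the level of unit identifiers. The sketch below makes several assumptions that are not in the paper: a body unit is reduced to the set of identifiers of the units it uses, unit names of the two modules are assumed not to clash, and the check that a(par) satisfies the requirement of par is omitted. The class and function names, and the use sets given for init, select, and include, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass
class Module:
    body: Dict[str, FrozenSet[str]]  # body unit -> identifiers of the units it uses
    imports: FrozenSet[str]
    params: FrozenSet[str]
    exports: FrozenSet[str]


def substitute(units: FrozenSet[str], a: Dict[str, str]) -> FrozenSet[str]:
    """Replace every unit in dom(a) by its actual unit; keep all others."""
    return frozenset(a.get(u, u) for u in units)


def compose(mod: Module, other: Module, a: Dict[str, str]) -> Module:
    """Compose mod and other through the partial assignment a."""
    for par, actual in a.items():
        if par not in mod.params:
            raise ValueError(f"{par} is not a formal parameter unit of the first module")
        if actual not in other.exports:
            raise ValueError(f"{actual} is not exported by the second module")
    # a(BODY): replace parameter units inside the use lists of the body units.
    body = {name: substitute(used, a) for name, used in mod.body.items()}
    body.update(other.body)
    return Module(
        body=body,
        imports=mod.imports | other.imports,
        params=(mod.params - frozenset(a)) | other.params,
        exports=substitute(mod.exports, a) | other.exports,
    )


paths = Module(
    body={
        "init": frozenset(),
        "select": frozenset(),
        "include": frozenset(),
        "shortest-paths": frozenset({"preprocessing", "init", "select", "include"}),
    },
    imports=frozenset(),
    params=frozenset({"preprocessing"}),
    exports=frozenset({"shortest-paths"}),
)
direction = Module(
    body={"id": frozenset(), "undirect": frozenset()},
    imports=frozenset(),
    params=frozenset(),
    exports=frozenset({"id", "undirect"}),
)

composed = compose(paths, direction, {"preprocessing": "undirect"})
print(sorted(composed.exports), sorted(composed.params))
```

Calling compose with the empty assignment yields the componentwise union, which corresponds to the special case described above.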

The aim of composition is to instantiate formal parameter units by actual 
ones such that, by a sequence of compositions, a given module with a non-empty 
formal parameter is transformed into a fully specified module. To achieve this, 
each formal parameter unit must be instantiated eventually. The results of in- 
stantiation depend only on the choice of the actual parameter, but not on the 
order of instantiations. This follows from the observation that the composition 
of modules is associative, i.e. successive composition may be done in any order
with the same result. More formally, if (MOD ∘_a MOD') ∘_a' MOD'' is defined,
then there are assignments b and b' such that MOD ∘_b (MOD' ∘_b' MOD'') is
defined and equals the former composition. The converse holds as well. In both
compositions, the same formal parameter units are assigned to the same
exported units, i.e., a equals b on dom(a), a' equals b' on dom(b'), and a' equals
b on formal parameter units of MOD which are instantiated by exported units
of MOD''. Together with the latter assignment, the compositions above are
uniquely represented in the following graphical way:




This observation has a nice consequence. A graph transformation system may be 
specified in a graphical form as a network of modules with partial assignments as 
edges. If the network is acyclic and no formal parameter unit is instantiated more 
than once, it can be transformed into a transformation module by composition, 
which yields a formal semantics. 

The export interface of a sequence of compositions is more or less the union of 
all export interfaces of the involved modules. This is useful if the exported units 
shall be available as actual parameters, but in the end it may be desirable to 
hide some of them. This may be achieved by another module building operation 
accompanying the composition: an export restriction which allows the restriction 
of the export interface of a module to a subset. 

Example 4. As a third and last module discussed in our series of examples we
shall consider the module radius_k, where k ∈ N_∞.



radius_k

param: assign-numbers

body: mark, restrict

export: mark, restrict



The idea is to apply the unit assign-numbers, which is supposed to yield
graphs whose nodes are labelled with numbers in N_∞, and then either

- mark all nodes labelled by a number less than k and remove the numbers or

- restrict the graph to the subgraph induced by these nodes.

We shall omit the explicit definition of mark and restrict, which is
straightforward.

One may now instantiate the formal parameter unit of paths by building
paths ∘_a direction, where a(preprocessing) ∈ {id, undirect}, and then
instantiate assign-numbers in radius_k with the exported transformation unit
shortest-paths of the resulting module. Recalling the discussion at the end of
Example 3, if a(preprocessing) = id then mark will mark exactly those nodes of
the input graph with o which can be reached from the query node on a path of
length < k, and restrict restricts the input graph to the nodes on such paths.
However, if we choose a(preprocessing) = undirect and let k = ∞ then restrict
returns the connected component of the input graph the query node belongs to.
By associativity, in both cases the same result is obtained by first instantiating
assign-numbers in radius_k with the exported transformation unit of paths and
then instantiating the formal parameter unit of the resulting module with id or
undirect, respectively. □
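
The two instantiations discussed in this example can be illustrated by a sketch. It is written under assumptions that go beyond the text: edges are unlabelled and have length 1, shortest paths are computed by breadth-first search rather than by the rule-based units init, select, and include, and restrict returns only the node set of the restricted graph. All function names are hypothetical.

```python
import math
from collections import deque
from typing import Callable, Dict, Set, Tuple

Edge = Tuple[str, str]


def identity(edges: Set[Edge]) -> Set[Edge]:
    return set(edges)


def undirect(edges: Set[Edge]) -> Set[Edge]:
    return set(edges) | {(t, s) for (s, t) in edges}


def shortest_paths(
    nodes: Set[str],
    edges: Set[Edge],
    query: str,
    preprocessing: Callable[[Set[Edge]], Set[Edge]],
) -> Dict[str, float]:
    """Label every node with its distance from the query node (inf if unreachable)."""
    edges = preprocessing(edges)
    dist = {v: math.inf for v in nodes}
    dist[query] = 0
    queue = deque([query])
    while queue:
        u = queue.popleft()
        for (s, t) in edges:
            if s == u and dist[t] == math.inf:
                dist[t] = dist[u] + 1
                queue.append(t)
    return dist


def restrict(labels: Dict[str, float], k: float) -> Set[str]:
    """Keep the nodes labelled by a number less than k."""
    return {v for v, n in labels.items() if n < k}


nodes = {"q", "a", "b", "c", "d"}
edges = {("q", "a"), ("a", "b"), ("c", "q")}

directed = restrict(shortest_paths(nodes, edges, "q", identity), 2)
component = restrict(shortest_paths(nodes, edges, "q", undirect), math.inf)
print(sorted(directed), sorted(component))
```

With identity as the preprocessing step and k = 2, only the nodes within distance less than 2 on directed paths remain. With undirect and k = ∞, the result is the connected component of the query node, since unreachable nodes keep the label ∞.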



6 Related work 

In the field of graph transformation several other modularization concepts have 
been proposed, a comparison of which can be found in [HEET99]. In this section, 
we briefly discuss the concepts of TGTS-modules [GRPS99], PROGRES packages
[SW98], DIEGO module systems [TS95], and the structuring principles of [EE96],
and sketch some main differences between them and transformation modules.
Note that apart from PROGRES packages, of which a public domain
implementation is currently being worked out, the different concepts, including
transformation modules, are not yet implemented.

TGTS-modules. A TGTS-module consists of an import interface, an export in- 
terface, and a body, each of which is a typed graph transformation system (i.e. 
a type graph and a set of named rules). The import interface can be used within 
the body, i.e. it is related to the body by an inclusion relation. Each rule of 
the export interface is temporally or spatially refined in the body. A temporal
refinement of a rule corresponds to a sequential composition of several rules with 
the same semantics. A spatial refinement implements the semantics of a rule by 
an amalgamation of several rules. The type graph of the export interface shows 
parts of the type graph of the body, i.e. it is related to the body type graph by a 
morphism. The semantics of a TGTS-module consists of the derivations specified 
by the export interface. 

Two operations on TGTS-modules are defined: union and composition. Rough- 
ly speaking, the union glues two modules together in a common submodule, and 
the composition instantiates the import interface of one module with the export 
interface of another module. 

TGTS-modules differ from transformation modules at least with respect to 
the following points. First, in contrast to the approach-independent
transformation modules, TGTS-modules are defined over the double pushout approach
[CMR+97]. Second, a TGTS-module imports, encapsulates, and exports sets of
rules (together with a type graph) whereas the basic components of
transformation modules are transformation units (which themselves constitute a
structuring principle for graph transformation). Third, the composition of
TGTS-modules always instantiates the whole import interface with the export
interface of another module and not parts of it, as is the case in transformation modules.

PROGRES packages. The concept of PROGRES packages is mainly motivated by
the recent development of UML packages for object-oriented modeling [BRJ98].
A package in PROGRES encapsulates a graph schema plus a set of
(procedure-like) graph transformation operations. The items of a package are provided with
visibility tags to determine in which way they can be used by other packages.
The value of such a tag is either public, protected, or private. Public and
protected items are either mutable or immutable as well as either redefinable or
final.

PROGRES packages can be imported, refined, or nested. All public items of 
a package define its export part which can be imported or refined by other 
packages. The protected part can only be refined but not imported. The values 
mutable and redefinable do not impose any restrictions on the visibility of items. 
Immutable resources may not be directly manipulated by graph transformation 
within an importing package. Final items may not be redefined within a refining 
package. The private resources of a package are not visible from its outside. The 
visibility values of a package must obey a set of static semantic rules. Altogether, 
a PROGRES package builds a visibility shell around a set of declarations and a 
system of packages is semantically equivalent to a single package containing all 
declarations of the system. 

In contrast to PROGRES packages, there exists no refinement relation between
transformation modules. Hence, the visibility tags protected and redefinable have
no counterpart in transformation modules. The visibility tag mutable allows the
exported subparts of graph schemes to be transformed. In transformation modules,
this view-based approach is not included. Moreover, like TGTS-modules, PROGRES
packages are approach-dependent, i.e. they are defined over a specific graph
transformation approach.

DIEGO module systems. A DIEGO module system is a directed graph where the
nodes are EGO modules and the edges are use relations. An EGO module consists
of three graph transformation systems: an import interface, a body, and an
export interface. The import and export interfaces are related to the body via
morphisms. Two EGO modules A and B can be related to each other with a use
relation which is, roughly speaking, a graph transformation system that represents
those imported parts of A which are exported by B. In particular, A may import
subrules of B's exported rules and may extend them in its body.

Each state of a DIEGO module system is represented with a hierarchically
distributed graph and its semantics is a hierarchically distributed graph grammar 
[Tae96] generating all its states. 

DIEGO module systems are not approach-independent. But they support the
modelling of distributed systems in a structured way, which is not yet provided in
the concept of transformation modules but should be included in the future.
Like a TGTS-module, an EGO module imports, encapsulates, and exports graph
transformation systems which in general are not structured (in contrast to the
transformation units imported, encapsulated, and exported by transformation
modules). Moreover, DIEGO does not provide control structures for the derivation
process specified by a module.

GSPEC modularization. In [EE96] different structuring concepts known from 
various specification techniques are adapted to graph transformation. These con- 
cepts include distributed graphs, inheritance, and import-export interfaces. The 
basic components of a GSPEC module are graph class specifications, consisting of
types for graphs and/or graph morphisms, graph transformation operation sym- 
bols, and constraints. They specify graph transformation systems which consist 
of graphs and/or graph morphisms and graph transformation procedures, and 
satisfy the constraints. 

For distributed graph transformation, distributed graph class specifications 
are defined, containing a sequence of local graph class specifications and a global 
one. The local graph class specifications specify graph transformation systems 
which operate on local state graphs whereas the global graph transformation 
system contains the elements of the local ones and provides e.g. operations to 
split (or construct) a current global state graph into (or from) several local ones. 

The inheritance structuring concept is formalized with morphisms and
inheritance morphisms on graph class specifications. The former allow inherited
types and operations to be redefined; the latter only allow new types and
operations to be added to the inherited ones.

For the concept of import and export interfaces graph class specifications 
with interfaces are defined consisting of a parameter interface, an import inter- 
face, an export interface, and a body, each of which is a graph class specification. 
The different components of a graph class specification with interfaces are related 
in a certain way by morphisms, expressing, roughly speaking, that the import, the
export, and the parameter occur in the body, and that the parameter is part of
the import and the export. Moreover, there exists an operation which allows one to
construct for each graph transformation system specified by the import a graph
transformation system specified by the body (for example by rule amalgamation, 
control conditions, etc.). Semantically, each graph transformation system spec- 
ified by the import is related to the corresponding constructed exported graph 
transformation system. 

All GSPEC modularization concepts take graph class specifications as basic 
components, but they are discussed independently from each other. An integra- 
tion of the concepts into one uniform framework is still an open problem. This 
distinguishes the GSPEC approach from the previously considered concrete mod- 
ule systems and from transformation modules. However, none of those concrete 
concepts provides all three structuring principles of GSPEC. 

The abstract definition of a graph class specification also allows for approach 
independence, i.e. graph transformation systems of various approaches can be 
translated to graph class specifications. Nevertheless, concrete applications of 
the structuring concepts to existing graph transformation approaches have still 
to be worked out. 

7 Conclusion 

In this paper, the notion of transformation modules has been introduced. It 
yields a structuring principle for graph transformation systems which
generalizes the “simple modules” introduced in [HHKK98]. The main extension with
respect to simple modules consists in the distinction between imported
transformation units and formal parameters. The import refers to specific modules with
a known semantics whereas a formal parameter has a loose semantics and may 
be instantiated by exported units of other modules, in a way which is not known 
in advance (but satisfies the requirements formulated in the formal parameter). 

Instantiation is done via composition, which is a rather general, but simple 
operation that is based on the union of modules and, thus, is not restricted to a 
pure instantiation of formal parameter units in an intuitive sense. 

Since the concept is quite new (a property which it shares with all other 
advanced modularization concepts for graph transformation systems) there is 
still a lot of work to be done on several levels. On the conceptual and theoretical 
levels the properties and limitations have to be worked out, which will probably 
lead to further extensions, restrictions, or modifications. Concerning the question 
of practical applicability, one has to implement the concepts presented here and 
to use them for the implementation of large-scale applications in order to find 
out whether they are appropriate to handle non-toy examples. It seems clear, 
however, that the availability of flexible and powerful structuring concepts is one
of the central prerequisites if one aims at using graph transformation for the 
specification and implementation of real-world systems. 

Acknowledgement. We thank the referees for their detailed and useful comments.
Abstract. Due to the special requirements of distributed systems, it
is important that modeling techniques for this kind of system offer a
stringent module concept. Each module has to support the encapsulation
of data structure as well as functionality, also at runtime. Modular graph
transformation, presented in this contribution, supports these features.
Modules are built up of specifications where attributed graphs describe 
the static data structures, whereas the dynamic behavior is modeled by 
the controlled application of graph rules. Rule expressions are used to 
formulate the control flow. 

Within one module, we can state a (weak) preservation of export and 
import behavior with respect to the local behavior in the module's body in the sense
that an interface derivation is subsumed by a local derivation if it can 
be performed. Modules may use each other meaning that each import 
interface has to be connected with an export interface in a way that the 
import behavior is subsumed by the export behavior. 



1 Introduction 

Distributed systems are a special challenge for software development concepts, 
languages and tools. Objects and tasks are distributed on different processing 
units and can only be accessed by using certain communication networks. In this 
context, important issues are allocation of objects and tasks, remote interaction
as well as object replication and migration. 

Good progress has been made in the development of programming languages 
for distributed applications as well as middleware providing distribution
services to distributed applications. But only a limited amount of work has been
reported on extending modeling techniques by new features for the description 
of distribution issues. For this purpose, in particular UML [UML99] and its re- 
cent extension for real-time systems [SR98] promise support which, however, 
seems to be insufficient, at least concerning the development of dynamic dis- 
tributed object structures. Considering especially this issue, graphs and graph 
transformation offer good means for modeling. The data and object structures 
can be modeled by graphs whereas the dynamic behavior is described by graph
transformation. 

To manage the complexity of distributed systems, the reuse of specifications
and software is important. For this reason, modeling techniques should offer
encapsulation and refinement concepts to support the partitioning of a system
into concurrent, communicating subsystems. Several approaches to modular
graph transformation have been considered (see [HEET98] for an overview of
several module concepts including those for GRACE and PROGRES), mainly
within an undistributed setting. Within this paper, we consider modular graph 
transformation with a distributed semantics. Comparing our approach with a 
previous one, called DIEGO systems [TS95], a module concept for simple graph 
transformation also coming along with a distributed semantics, our emphasis 
now lies on modules for attributed graph transformation with application con- 
trol. The graphical notation of the modules follows the UML-notation as far as 
possible, i.e. wherever corresponding modeling elements are used. 

Each module contains a type graph to model the object structures which are 
possible in principle. Parts of a type graph (which may overlap) can be declared 
as export interfaces. This guarantees some information hiding and simultane- 
ously allows direct export of certain object structures. Moreover, parts of a type 
graph that are required from the environment may be marked as import inter- 
faces. The graphical notation of type graphs follows that of UML class diagrams. 

The dynamic behavior in a graph transformation module is described by 
methods. A basic operation is modeled by just one graph rule. Complex actions 
are specified by rule expressions which contain the control flow of certain rule 
applications. Rule expressions are graphically represented by restricted activity 
diagrams and, thus, resemble story diagrams [JZ98]. By exporting a method not 
only its name and parameters are published, but also a rule which describes the 
effect of the method on the exportable types. A subset of a module's rules may
be declared as import rules. The import rules state what is needed by the body 
methods to be performed and can be used to search for the right exports within 
a network. 

Graph transformation modules communicate by connecting an import of one 
module to an export of another. These connections concern the type graphs as
well as the methods, and select those parts of the export that are of interest to the
connected import. Since modules may be distributed over several sites, remote interaction
takes place whenever such an import-export connection is used. 

Semantically, modular graph transformation is considered in two ways. First, we
investigate the behavior relations between different module components. The
relations between body, export and import interfaces may also use composed rule
applications, corresponding to composed derivations. The important semantic
property of such a relation is the preservation of behavior in a stronger or a
weaker sense: given any derivation of, e.g., an export interface, there is a
corresponding derivation in the related component (body) such that the visible
part of this derivation coincides with the export derivation (strong
preservation), or the corresponding derivation (in the body) may fail or may subsume
not only the interface derivation but also further actions (weak preservation).
Weak preservation guarantees at least that nothing wrong happens; it might be,
however, that nothing at all happens.

Furthermore, we take a more general point of view: since our goal is to 
use modular graph transformation to specify distributed systems, we introduce 
moreover a semantics based on distributed typed graph grammars. 

Throughout this paper, all concepts are introduced along a running example 
which is a (simple) distributed file manager. We consider a client-server system 
in which the client may replicate files from the server's site. The export of the
server contains replicable files, but usually the client just needs a small amount 
of exported files and replicates only those. 



2 Controlling Rule Application by Activity Diagrams

A graph transformation specification consists of a type graph, a start graph and 
a finite set of methods. Type graphs describe the static structures, i.e. object 
types and the types of their attributes as well as relations. A method has a head 
consisting of a name and a parameter list, and a body. The body may be a rule 
or an activity diagram which may trigger rule applications. Activity diagrams 
are a well-known sublanguage of the modeling language UML [UML99]. 

As an example, we consider a file manager with its typical methods such as 
creating a new file, updating, copying and renaming it and possibly destroying 
it. 

In Figure 1 the type graph as well as the method heads are shown. The 
start graph is empty. The bodies of the update methods are depicted in Figure 
3. The other method bodies are not presented, but should be imaginable. The 
methods mentioned are just a subset of methods which should be supported by 
a file manager. Later on additional methods are introduced dealing with the 
replication of files in a distributed environment. 




Fig. 1. A sample graph transformation specification (without method bodies) 
A basic operation on object structures can be described by one rule in an
if-then manner. The left-hand side states the premise which contains tests on 
the existence and non-existence of object structures whereas the right-hand side 
states the actions which have to be performed. If attributes are just tested and 
not changed on the right-hand side, they only occur on the left-hand side. On 
the other hand, attributes which are changed but not tested for rule application 
only occur on the right-hand side. There is no order among these actions except 
the fact that object nodes have to be inserted before they can be set into a 
relation, and deletion has to be done vice versa. Thus, independent basic actions 
may be performed in parallel. 

Sequential execution of actions is expressed by several rules to be applied 
one after the other. To be able to specify the order of rule application,
we use activity diagrams. We would like to express at least sequential and
parallel application as well as conditional execution. Thus, in the following we
use restricted activity diagrams containing the operators in Figure 2. The initial
state is indicated by a small solid filled circle. A final state occurs as a circle
surrounding a small solid filled circle. The basic activity of rule application is
depicted by an active state containing the rule's name with the actual parameter
list. The execution of already composed methods is triggered in the same way. 
Examples for composed methods are given in Figures 3 and 4. 



Fig. 2. Operators in restricted activity diagrams: sequential composition,
parallel composition, and decision



Figure 3 describes the storing of an update of a file locally on the server. If the
file is updated for the first time, rule “newUpdate” is used. This condition is
expressed by a negative application condition for this rule stating that there is no
“Update”-node with a relational edge to the file to be updated. A negative application
condition is depicted by dashed nodes and edges. In the case of a first update, a
new “Update”-node is created and initialized. If the file is repeatedly updated,
the “Update”-node has to exist and its attribute values are overwritten by new
values.
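
The choice between the two rules can be mimicked in a few lines. This is only a sketch and does not use graph rules: the object structure is reduced to a dictionary from file names to an optional update record, the attribute names content and date are assumptions derived from the parameters c and d, and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Update:
    content: str
    date: str


@dataclass
class FileStore:
    # File name -> attached "Update" node, or None if the file has none yet.
    updates: Dict[str, Optional[Update]] = field(default_factory=dict)

    def new_update(self, fn: str, c: str, d: str) -> bool:
        """Applicable only if the file exists and has no Update node (the NAC)."""
        if fn not in self.updates or self.updates[fn] is not None:
            return False
        self.updates[fn] = Update(c, d)
        return True

    def repeated_update(self, fn: str, c: str, d: str) -> bool:
        """Applicable only if an Update node exists; overwrites its attributes."""
        node = self.updates.get(fn)
        if node is None:
            return False
        node.content, node.date = c, d
        return True

    def local_update(self, fn: str, c: str, d: str) -> str:
        """Try the first-update rule, then the repeated-update rule."""
        if self.new_update(fn, c, d):
            return "newUpdate"
        if self.repeated_update(fn, c, d):
            return "repeatedUpdate"
        raise KeyError(fn)


store = FileStore({"a.txt": None})
print(store.local_update("a.txt", "v1", "1999-09-01"))
print(store.local_update("a.txt", "v2", "1999-09-02"))
```

The two applicability tests are mutually exclusive for an existing file, so the order in which local_update tries the rules does not affect the result.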

Fig. 3. A sample method performing a local update of a file on the server: method
localUpdate(String fn, String c, Date d) with the rules newUpdate(String fn,
String c, Date d) and repeatedUpdate(String fn, String c, Date d), each applied
with the actual parameters (fn, c, d)

In Figure 4, the publication of an update performed on the client or server
site is modeled. Parameter m can have two values: “C2S” (from client to server)
or “S2C” (from server to client). In the case of “C2S”, the server's file content
is updated using the parameter values provided by the client. If a server's
update should be published, two cases may occur: if there has not been any update,
then rule “noUpdate” is applicable. Otherwise rule “serverToClientUpdate” is
triggered, which exports the locally stored update. Finally, the temporarily stored
data for the updated file is deleted.

If a restricted activity diagram does not contain decisions we call it simple,
because it can be provided with a formal semantics based on typed graph
transformation specifications as given in [GPS98]. In this case, a method's body is
semantically equal to a set of composed rules. See section 4 for further details,
where requirements and first ideas for the extension to non-simple rules are also
given.

3 Systems of Graph Transformation Modules 

A graph transformation module consists of a body and two sets of interfaces,
namely import and export interfaces. All these module parts are typed graph
transformation specifications. The type graphs of the interfaces can be embedded
into the body's type graph. Every export method, which is just a rule, has an
implementation (refinement) by a body method. Semantically, an export rule
has to be subsumed by the composed rule, restricted to the export types, that
describes the semantics of the body method. Each import method is automatically a
body method and can be used in other body methods.

One module uses another one if its import interface is related to the other's
export interface. This means that the import's type and start graphs can be
embedded into the corresponding graphs of the export. Moreover, each import
method is a submethod of the corresponding export one.
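The embedding requirement on type graphs can be sketched as a simple inclusion check. The following Python fragment is only an illustration under strong assumptions: type graphs are encoded as dictionaries, the embedding is taken to be name-preserving inclusion (a general embedding may rename types), and the edge type “contains” as well as the attribute names “read” and “write” are hypothetical.

```python
def is_embedded(small, large):
    """Check that type graph `small` is included in type graph `large`.

    A type graph is a dict with "nodes" (type name -> set of attribute names)
    and "edges" (set of (source type, label, target type) triples).
    """
    nodes_ok = all(
        node in large["nodes"] and attrs <= large["nodes"][node]
        for node, attrs in small["nodes"].items()
    )
    edges_ok = small["edges"] <= large["edges"]
    return nodes_ok and edges_ok

# Server body: local types and attributes in addition to the exported ones.
server_body = {
    "nodes": {"File": {"fname", "fcont", "lm", "read", "write"}, "Dir": {"dname"}},
    "edges": {("Dir", "contains", "File")},
}
# Export interface of the server and import interface of the client.
server_export = {"nodes": {"File": {"fname", "fcont"}}, "edges": set()}
client_import = {"nodes": {"File": {"fname", "fcont"}}, "edges": set()}
```

With these definitions, is_embedded(client_import, server_export) and is_embedded(server_export, server_body) both hold, while the body cannot be embedded into the export.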

In the following, we consider a simple module structure consisting of one
server, the file manager, and one client, an application that uses the file manager.
This scenario could be extended by further clients using the file manager.
Each client has an import interface to the server's export interface.

Fig. 4. A sample method performing an update of a replicated file (rule signatures shown: StartUpdate (String fn, String c, String m), noUpdate ( ))

The type graphs of client and server as well as the interfaces together with
corresponding relations are shown in Figure 5. Here, a typical case is shown.
Client and server communicate via objects of type “File”, but not all attributes
are exchanged. While the file name and its contents are communicated, the date
of the last modification and the read and write permissions are set locally.
Moreover, client and server have local types such as “Dir” for directories, etc.
All interface types and attributes have a grey background.

Fig. 5. Relations between type graphs

Figure 6 sketches local and distributed operations as well as the relations
between them. The key operations in a distributed file manager are those which deal
with export, replication and update of distributed files. While the export of a file
is locally performed on the file manager, the replication of a file is local to the
client. Furthermore, two different update operations are distinguished: a
local update on the server and a synchronized update on both systems. Clearly,
the latter is possible for replicated files only. These four operations are
representatives for each kind of distributed action that may occur in a distributed setting:
(1) exporting, (2) importing, (3) local and (4) synchronization. They are shown
in detail in Figures 3, 4, 7, 8, and 9. (In contrast to all the other operations,
“localUpdate” and “publicUpdate” are not just modeled as rules, but as rule
compositions. Although there are several simple rules to implement these operations,
only the composed ones are mentioned in Fig. 6 to keep the diagram simple.)




Fig. 6. Relations between module operations 

In Figures 7, 8, and 9 exported as well as imported nodes have a grey back- 
ground. Doing so, corresponding rules in a body and an interface can be depicted 
in an integrated way. 

Figure 7 shows the server's operation “export”. This operation can be
performed locally by the server. However, removing files from the export is possible only
when they are not used remotely.

Fig. 7. Exporting a file for replication and abstract rule of “localUpdate” refinement

Figure 8 shows the rules for replication within the client's view in an
integrated way. The server's export rule contains all grey graph parts. The left-hand
side of the import rule is empty, the right-hand side consists of the grey part.
The body rule comprises the whole rule shown in Figure 8. Since the export is
not changed but only used, this operation can be performed without changing
the server's body.



Fig. 8. Replicating a file (showing the client view in an integrated way). Recoverable labels: Client (Application); replicate (String dn, String fn); Dir: dname = dn; File: fname = fn, fcont = c, lm = Date.today(); Export View

In Figure 9 the client view is depicted again, here for rule “update”. The server's
export rule looks like the client's import rule. Update is a synchronized operation
between client and server. Besides update, the retrieval of information that is
not already available in the export can also be modeled by synchronized rule
application.

4 Semantics of modular graph transformations 

In this section we describe the semantics of modular graph transformation in two
steps: first we give the semantics in terms of the behavior of its components and
w.r.t. the relations established between them. Subsequently, we take a more
general point of view: since our goal is to use modular graph transformation
to specify a distributed system, we give its semantics in terms of a distributed
typed graph grammar.

Fig. 9. Rule “update” in the import interface and in the body of the client



4.1 Behavior relations 

Among the relations between graph transformation specifications described in 
this paper, the export-body refinement relation is the most general one: both 
import-body and import-export relations can be considered as special cases of 
refinements where each rule of the source specification is associated to just one 
rule of the target one. 

Hence, by analyzing the behavior of two graph transformation specifications 
related by a refinement, both the behavior of a single module and the behavior 
of interconnected modules (i.e. of a system of graph transformation modules) 
can be described. 

A refinement of a graph transformation specification is given essentially by
associating with each of its rules a composition of rules of a refining system.
We distinguish between strict, weak and subrule refinements, depending on the
required relation between the original rules and the associated composed ones.
The three alternatives are discussed in the following paragraphs. We point out,
however, that while for strict refinements a formal theory has already been
developed (see [GPS98]), for weak and subrule refinements the theory is still under
development.



Strict refinement Strict refinements have been presented in [GPS98]. They 
require the composed rule to coincide with the translation of the original one to 
the type graph of the refining system. This means that the composed rule must 
not use private types (i.e. types not visible from the abstract specification) in 
the initial and final graph: they have to be hidden in the intermediate steps of 
the composition. 

In [GPS98] the rules are simple DPO rules without attributes and application
conditions, and the basic composition operations on graph transformation
rules are sequential composition by concatenation and parallel composition by
amalgamation, corresponding to temporal and spatial refinements, respectively.
In that case a strict preservation of behavior property can be proved, stating
that each derivation of the more abstract specification can be translated along
the refinement into a composed derivation of the more refined one, whose visible
part (i.e. its restriction to the smaller type graph of the abstract specification)
coincides with the given abstract one.

According to [Koc99] and [Gro99], this theory can easily be extended to the 
more general case of DPO rules with attributes and application conditions. 

With this extension, it would be nice to use application conditions for
modeling case distinctions, like if-then-else and case constructs. The idea would be
to map each abstract rule $p$ into a set of concrete rules $\{p_1, \dots, p_n\}$, representing
different implementations of $p$ in different contexts, specified by the application
conditions of each $p_i$. This, however, is not possible for strict refinements because
of the condition they require between the abstract rule $p$ and each of its
implementations $p_i$: the translation of $p$ over the refining type graph should coincide
with $p_i$, and that means that there could be just one implementation (i.e. no
alternatives are possible).



Weak refinement A weak refinement demands a weaker condition than a strict 
one: it requires the restriction of the composed rule to the more abstract type 
graph to coincide with the original rule. In this way the composed rule can use 
private types of the refining specification, and only its visible part over the more 
abstract type graph has to coincide with the original rule. 

On the one hand, this relaxed condition allows modeling if-then-else and case
constructs as explained above: an abstract rule is mapped into a set of concrete
rules (representing different implementations of it) and the requirement stated
above ensures that the visible part of each refining rule over the abstract types
coincides with the abstract rule. The same idea can also be used to model
refinement by iteration, by mapping an abstract rule to the set of iterations of all
lengths.
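The difference between the strict and the weak condition can be made concrete with a small Python sketch. It is a strong simplification and not the formal construction of [GPS98]: a rule is reduced to a pair of node sets (left-hand and right-hand side) without edges, attributes or application conditions, the refining type graph is assumed to contain the abstract one by inclusion, and all rule and type names are hypothetical.

```python
def restrict(rule, visible_types):
    """Restrict both sides of a rule to the nodes whose type is visible."""
    lhs, rhs = rule
    def keep(graph):
        return frozenset(item for item in graph if item[1] in visible_types)
    return keep(lhs), keep(rhs)

def is_strict_refinement(abstract, concrete):
    # Strict: the concrete rule coincides with the translated abstract rule,
    # so no private types may occur in its left-hand or right-hand side.
    return concrete == abstract

def is_weak_refinement(abstract, concrete, visible_types):
    # Weak: only the visible part of the concrete rule has to coincide.
    return restrict(concrete, visible_types) == abstract

VISIBLE = {"File"}
# Abstract rule p: create a File node.
p = (frozenset(), frozenset({("f", "File")}))
# Two alternative implementations using the private types Dir and Update.
p1 = (frozenset({("d", "Dir")}), frozenset({("d", "Dir"), ("f", "File")}))
p2 = (frozenset({("u", "Update")}), frozenset({("f", "File")}))
```

Both p1 and p2 are weak refinements of p, since their private context disappears under restriction, but neither is a strict one. This mirrors the observation that alternatives are only possible in the weak setting.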

On the other side, the preservation of behavior property as stated for strict 
refinements does not hold any longer, and only a weak preservation of behavior 
can be proved. Consider a direct derivation on the more abstract system: the 
applied rule corresponds to a composed one in the refining specification but, 
since it can use private types, there need not be a match for it. However, if 
there is a match and its visible part over the abstract types coincides with the 
abstract match, then also the visible part of the refining derivation coincides 
with the abstract one. Thus, the refining system may fail to realize the required 
behavior, but it does not yield wrong results. 

For example, the export view of rule update in Figure 9 (server side) and
the composed rule publicUpdate in Figure 4 are related by a weak refinement.
Moreover, the import and body views of rule update in Figure 9 (client side) are
related by a simple one-to-one weak refinement, too.

Subrule refinement A subrule refinement allows an even more relaxed
situation than a weak refinement: it just requires the abstract rule of a refinement to
be a subrule of the composed rule restricted to the more abstract type graph.

The subrule relation between two rules is modeled by a rule morphism (i.e. 
by three graph morphisms relating the three graphs of a rule) and it can be 
required that the resulting squares are either pullbacks or simply commuting 
diagrams. In both cases (pullbacks or commuting diagrams) a subsumption of 
behavior property can be proved: if an abstract rule and a composed rule are 
related by a subrule relation and there are two matches for them such that the 
visible part of the more concrete match over the abstract types coincides with 
the abstract one, then also the abstract derivation is subsumed by the visible 
part of the refining derivation. Even in this case, a match for the composed rule
need not exist, i.e. the refining system may fail to realize the required behavior,
but it does not yield wrong results.

Rule morphisms requiring pullbacks obviously yield a stronger subsumption
of behavior than rule morphisms requiring simply commuting diagrams. On the
other hand, from the viewpoint of applications, requiring pullbacks means that
only synchronous actions can be modeled, while requiring simply commuting
diagrams means allowing asynchronous actions, too. In our running example
subrule refinements are used to model import-export relations (these are only
one-to-one subrule refinements). Both synchronous actions (see the import and export
views of rule export in Figure 7) and asynchronous actions (see the import and
export views of rule replicate in Figure 8) are present. Moreover, subrule
refinement is used within the bodies to model complex local actions. E.g., to define a
refinement relation for the operation “localUpdate” defined in Fig. 3, there has
to be a circular refinement relation on the server's body. The abstract
rule is shown in Fig. 10.




Fig. 10. Abstract rule of the “localUpdate” refinement in Fig. 3



4.2 Distributed semantics 

Our actual aim is to use modular graph transformations for specifying
distributed systems. Using distributed typed graph grammars, modular graph
transformation can be provided with a distributed semantics in the following sense: Each
module has its own local state and can run independently of other modules. To
perform synchronous communication, certain transformation steps have to be
performed in parallel with corresponding steps of other modules.



Fig. 11. Detailed version of rule replicate (String dn, String fn) in Figure 8



The schema of a modular graph transformation is a graph having graph 
transformation specifications as nodes and the relations between them as edges 
(the schema of our running example is shown in Figure 5 concerning the type 
graphs and in Figure 6 concerning the module behavior). From the point of view 
of the distributed system specified by the modular graph transformation, this 
graph describes the system structure (local components and network connec- 
tions): it is called network graph and we use it for determining the distributed 
type graph, distributed initial graph and the set of distributed rules character- 
izing a distributed typed graph grammar. 

More precisely, the type graphs (resp. initial graphs) and type graph
morphisms of all the nodes and edges in the network graph define the distributed
type graph (resp. initial graph), while the (composed) rules and the relations
between them define the distributed rules. Note that a distributed rule shows an
import, an export, a local action or a synchronized action between two modules,
i.e. each distributed rule is composed of those local rules which are related to
each other (consider Figure 6). For example, the distributed rule associated with
rule “replicate” is shown in Fig. 11.

Given a distributed rule, a distributed match for it is given by all the matches
of the corresponding local rules, and a distributed step consists in applying all
the local matches. In [TFKV99], it has been shown that if the distributed match
satisfies the so-called distributed gluing conditions, then the components of the
distributed rule can be applied locally and yield a consistent distributed state
as result of all these applications.
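The idea of a distributed step as the joint application of local rules can be sketched in Python. This is an illustration only and deliberately ignores the distributed gluing conditions of [TFKV99]: local states are plain sets of labeled items, a local rule is a pair of item sets, and a local match is reduced to set inclusion. The module names and item labels are hypothetical.

```python
def local_match(state, rule):
    # Simplified match: every item of the left-hand side is present in the state.
    return rule["lhs"] <= state

def apply_local(state, rule):
    # Delete what occurs only in the left-hand side, then add the right-hand side.
    return (state - (rule["lhs"] - rule["rhs"])) | rule["rhs"]

def distributed_step(states, distributed_rule):
    """Apply all local rules together, or none of them.

    states: dict mapping module name -> set of items (local state)
    distributed_rule: dict mapping module name -> local rule
    Returns the new states, or None if some local rule has no match.
    """
    if not all(local_match(states[m], r) for m, r in distributed_rule.items()):
        return None
    new_states = dict(states)
    for module, rule in distributed_rule.items():
        new_states[module] = apply_local(states[module], rule)
    return new_states

# Distributed rule for "replicate": the server's export is only used,
# while the client gains a replica of the file.
replicate = {
    "server": {"lhs": frozenset({"export:a.txt"}), "rhs": frozenset({"export:a.txt"})},
    "client": {"lhs": frozenset({"dir:home"}), "rhs": frozenset({"dir:home", "file:a.txt"})},
}
```

Starting from the states {"server": {"export:a.txt"}, "client": {"dir:home"}}, the step adds "file:a.txt" to the client and leaves the server unchanged. If the server does not export the file, no local rule is applied at all.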



5 Related Work and Tool Support 

Modular graph transformation as presented in this contribution is based on
distributed graph transformation. For modular graph transformation, we consider
a fixed communication mechanism based on import and export interfaces,
visibility constraints for rules and types as well as refinement relations between
export and body parts. Two larger case studies for distributed graph
transformation have been developed so far: a distributed file manager [Roo98], a
small section of which is used as running example here, and a distributed
configuration management system [TFKV99]. Both specifications apply distributed graph
transformation with the corresponding restrictions mentioned above; thus modular
graph transformation is used.

Comparing modular graph transformation to other types of module concepts 
for graph transformation, this is the only one which comes along with a dis- 
tributed semantics and some kind of control flow, even if it is quite restricted at 
the moment. (Compare the introduction.) 

Simple graph transformation offers just one concept for describing the system
behavior: rules. To model a distributed action on a certain level of abstraction,
graph rules are attractive and sufficient to describe the behavior. Using rules,
the actions are clearly modeled and global control flow is modeled locally within
the rules. But to model complex local actions, it soon becomes convenient to
describe the control flow explicitly. If we allow e.g. operations on directories,
the actions of a distributed file manager may soon become quite complex. In
[Roo98] rather complex control flows for rule application are specified using Java^
expressions. Especially concerning control flow we have to mention approaches
to modular specifications on the basis of e.g. state charts (0-charts [HG96])
or Petri nets [GGW98], which may have advantages, but they hardly support
convenient graphical modeling of data structures.

Furthermore, a recent extension of UML for real-time applications [SR98] offers
modeling concepts for a modular system design with explicit import and export
interfaces, called ports. The modularity units are called capsules, which
communicate only via ports. Similar to classes, capsules may have several instances,
which is not yet possible in our approach. While the system structure may
be modeled well in this UML extension, there is little support for modeling the
system dynamics in a modular way.

To test the practicability of this approach we are currently working on an
implementation of the presented ideas based on the graph transformation machine
AGG [TER99]^^. AGG performs algebraic graph transformation with negative
application conditions for rules. The graphs may be attributed by Java objects
and the rules may contain Java expressions as attribute values which are
evaluated at application time. The AGG environment comprises graph, rule, attribute
and graph grammar editors as well as a graph transformation interpreter.
Currently, AGG is being extended in such a way that import and export interfaces for
graphs, rules and graph grammars are realized as substructures of those. The
main focus lies on the synchronization of local transformations related by
subtransformations in a setting of several AGG systems possibly running on
different hosts. Each AGG system may hold one or more graph grammars. The
communication between import and export interfaces will be realized by remote
method invocations (RMI) supported by Java.

^ For more information concerning Java see: http://www.javasoft.com
^^ See also: http://tfs.cs.tu-berlin.de/agg.

6 Conclusion 

Modeling distributed systems by modular graph transformation offers visual
concepts to describe the network structure, distributed data structures as well
as the dynamic behavior. The stringent module concept supports the reuse of
specification parts. A module system is defined in a way that subsumption of
interface behavior is guaranteed in each module. The behavior of import and
export interfaces may be related more weakly, meaning that import interfaces may
start or stop using export items although the export is not changed.

Modular graph transformation in the double-pushout approach, as presented
here, supports the modeling of safe distributed systems where a server always
knows which parts of its exports are imported by which client. As long as a piece
of information is needed, it is exported. This is well suited to modeling e.g. distributed
software management tools [TFKV99]. Modeling e.g. WWW applications may be
better supported by using the single-pushout approach to modular graph
transformation, where exported items may be deleted or updated without informing
the using clients [Koc99].

This modeling approach is general enough to consider not only client-server
systems, but also other kinds of distributed systems, e.g. broadcasting systems.
In this contribution, we have not considered network reconfiguration. A next
step is to allow several instances of modules and their creation during runtime,
leading to highly dynamic networks. This aspect can also be nicely described by
distributed graph transformation [TFKV99], which already serves as a semantical
basis for modular graph transformation.

Abstract. Performance characteristics such as response time and throughput
play an important role in defining the quality of software products.
The software developers should be able to assess and understand the
performance effects of various architectural decisions starting at an early
stage, when changes are easy and less expensive, and continuing
throughout the software life cycle. This can be achieved by constructing and
analyzing quantitative performance models that capture the interactions
between the main system components and point to the system’s
performance trouble spots. The paper proposes a formal approach to
building Layered Queueing Network (LQN) performance models from Unified
Modeling Language (UML) descriptions of the high-level architecture of
a system, and more exactly from the architectural patterns used in the
system. The performance modelling formalism, LQN, is an extension of
the well-known Queueing Network modelling technique. The
transformation from a UML architectural description of a given system to its
LQN model is based on PROGRES, a well-known visual language and
environment for programming with graph rewriting systems.



1 Introduction 

Performance characteristics play an important role in defining the quality of 
software products, especially in the case of real-time and distributed systems. 
The developers of such systems should be able to assess and understand the 
performance effects of various architectural decisions, starting at an early stage, 
when changes are easy and less expensive, and continuing throughout the soft- 
ware life cycle. This can be achieved by constructing and analyzing quantita- 
tive performance models that capture the interactions between the main system 
components and point to the system’s performance trouble spots. Software
Performance Engineering (SPE) is a technique introduced in [15] that proposes to
use quantitative methods and performance models in order to assess the
performance effects of different design and implementation alternatives during the
development of a system. SPE promotes the idea that the integration of
performance analysis into the software development process, from the earliest stages
to the end, can ensure that the system will meet its performance objectives. This
would eliminate the need for "late-fixing" of performance problems, a frequent
practical approach that postpones any performance concerns until the system
is completely implemented. Late fixes tend to be very expensive and inefficient,
and the product may never reach its original requirements.

Although the need for SPE is recognized by industry, there are many barriers
that prevent its wide adoption, some of which are technical, others related to
management issues. One of the technical problems is the existence of a cognitive
gap between the software and the performance domains. Software developers 
are concerned with designing, implementing and testing the software, but they 
are not trained in performance modelling and analysis techniques. The software 
development teams depend usually on specialized performance groups to do the 
performance evaluation work, which leads to additional communication delays, 
inconsistencies between design and model versions and late feedback. 

This paper contributes toward bridging the gap between software architec- 
ture and performance analysis. It proposes a systematic approach, based on 
graph transformations, to build LQN performance models from UML descrip- 
tions of high-level software architectures. The high-level architecture of a system 
describes the main system components and their interactions at a level of
abstraction that captures certain characteristics relevant to performance, such as
concurrency, parallelism, contention for software resources (such as software servers
and critical sections), synchronization, serialization, etc. This paper is a
development of previous work by the same authors [8], where an "ad-hoc" language
for architectural descriptions was used instead of UML [2]. UML is attractive
because it is a standard, and is rapidly gaining acceptance in the software
industry. However, UML is a very rich, sometimes informal language, which raises
a number of yet unresolved issues. This paper is but a step in a longer research
effort, whose final objective is to implement the proposed model-building tech- 
nique in a tool, in connection with a UML-based CASE tool. By automating the 
construction of the performance models from software architectures, the time 
and effort required for SPE will be considerably reduced, and the consistency 
between the model and the system under development more easily maintained. 
Such a model will be solved with existing performance analysis tools, producing 
much faster feedback for the software development team. 

Frequently used architectural solutions are identified in literature as
architectural patterns (such as pipeline and filters, client/server, client/broker/server,
layers, master-slave, blackboard, etc.) [3,14]. A pattern introduces a higher-level
of abstraction design artifact by describing a specific type of collaboration
between a set of prototypical components playing well defined roles, and helps our
understanding of complex systems. The paper defines graph transformations
from a number of frequently used architectural patterns into LQN sub-models.

The formalism used for building performance models is the Layered Queueing
Network (LQN) model [17,18,9], an extension of the well known Queueing
Network model. LQN was developed especially for modelling concurrent and/or
distributed software systems. Some LQN components represent software processes,
others hardware devices. LQN determines the delays due to contention,
synchronization and serialization at both software and hardware levels (see section 3 for
a more detailed description). LQN was applied to a number of concrete
industrial systems (such as database applications, web server [5], telecommunication
system [16], etc.) and has proven useful for providing insights into performance
limitations at software and hardware levels.

The paper is organized as follows: architectural patterns and their represen- 
tation as UML collaborations are discussed in section 2, a short description of 
the LQN model is given in section 3, transformation of a few frequently utilized 
architectural patterns into LQN is presented in section 4, the PROGRES graph
schema and the principles for graph transformations are given in section 5, a 
case-study telecommunication system is presented in section 6 and conclusions 
in section 7. 

2 Architectural Patterns and UML Collaborations 

According to [1], a software architecture represents a collection of computational 
components that perform certain functions, together with a collection of con- 
nectors that describe the interactions between components. A component type is 
described by a specification defining its functions, and a set of ports represent- 
ing logical points of interaction between the component and its environment. 
A connector type is defined by a set of roles explaining the expected behaviour 
of the interacting parties, and a glue specification showing how the interactions 
are coordinated. A similar, even though less formal, view of a software architec- 
ture is described in the form of architectural patterns [3,14] which identify fre- 
quently used architectural solutions, such as pipeline and filters, client/server, 
client/broker/server, master-slave, blackboard, etc. Each architectural pattern
describes two inter-related aspects: its structure (what are the components) and 
behaviour (how they interact). In the case of high-level architectural patterns, 
the components are usually concurrent entities that execute in different threads 
of control, compete for resources, and may require synchronization. Concurrency 
aspects contribute to system performance, and therefore must be captured in a 
performance model. 

The paper proposes to use high-level architectural patterns as a basis for 
translating software architecture into performance models. A subset of frequently 
used patterns (some of which are later used in a case study) are described in 
this section in the form of UML collaborations (not to be confused with UML 
collaboration diagrams, a type of interaction diagrams) [2]. According to the 
authors of UML, a collaboration is a notation for describing a mechanism or
pattern, which represents "a society of classes, interfaces, and other elements
that work together to provide some cooperative behaviour that is bigger than
the sum of all of its parts." A collaboration has two aspects: structural (usually
represented by a class/object diagram) and behavioural (shown as an interaction
diagram). Collaborations can be used to hide details that are irrelevant at a
certain level of abstraction; these details can be observed by "zooming" into
the collaboration. The symbol for collaboration is an ellipse with dashed lines,
and may have an "embedded" square showing template classes. Another special
UML notation employed in this section is that of an active class (object) which
has its own thread of control, represented by a square with thick lines. An active
object may be implemented either as a process (identified by the stereotype
«process»), or as a thread.

The literature identifies a relatively small number of patterns used for high- 
level architecture [3,14]. The following patterns are discussed in the paper: 
pipeline and filters, client/server (with and without an intermediary broker), 
and critical section. The pipeline and filters pattern divides the overall processing 
task into a number of sequential stages, which are implemented as filters con- 
nected by unidirectional pipes. We are interested here in active filters [3] that are 
running concurrently. Each filter is implemented as an active object that loops 
through the following steps: "pulls" the data from the preceding pipe, processes 
it, and "pushes" the results down the pipeline. The way in which the pipelines 
are implemented may have performance consequences, as discussed in section 4, 
and shown in Fig. 3. 
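
The pull/process/push loop of an active filter can be sketched in a few lines of Python. This is only an illustration of the pattern, not the implementation studied in the paper: the names `active_filter`, `run_pipeline` and `SENTINEL` are hypothetical, each filter is assumed to be a thread, and each pipe is assumed to be a FIFO queue.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker, not part of the pattern itself


def active_filter(process, inbox, outbox):
    """Loop of one active filter: pull, process, push."""
    while True:
        item = inbox.get()          # "pull" the data from the preceding pipe
        if item is SENTINEL:
            outbox.put(SENTINEL)    # propagate shutdown to the next filter
            return
        outbox.put(process(item))   # process it and "push" the result downstream


def run_pipeline(stages, data):
    """Run one thread per filter; consecutive filters share a unidirectional queue."""
    pipes = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=active_filter, args=(stage, pipes[i], pipes[i + 1]))
        for i, stage in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in data:
        pipes[0].put(item)
    pipes[0].put(SENTINEL)
    results = []
    while True:
        out = pipes[-1].get()
        if out is SENTINEL:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results
```

Because every filter runs in its own thread, several items can be in different stages at the same time, which is the concurrency the performance model has to capture.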

a) UML collaboration for the forwarding broker pattern
b) UML collaboration for the handle-driven broker pattern

Fig. 1. UML collaborations for two variants of the Client-Server pattern

The client-server pattern is one of the most frequently used in today's distributed
systems, especially since the introduction of new midware technology
such as CORBA [7], which facilitates the connection between clients and servers
running on heterogeneous platforms across local or wide-area networks. Since the
communication between clients and servers (directly or through brokers) has an
important effect on performance, different alternatives of the client-server pattern
are considered in the paper. Fig. 1.a shows the UML collaboration for the
client-server pattern with forwarding broker, where the broker relays a client's
request to the relevant server, retrieves the response from the server and relays
it back to the client. (When relevant, the "object flow" carried by a message is
represented in a UML class/object diagram by a little arrow with a circle, while
the message itself is an arrow without a circle. A synchronous message implies
a reply, and can therefore carry objects in both directions.) The forwarding broker
acts as an intermediary between clients and servers in all their interactions. This
introduces performance penalties due to network delays, especially when the
client, broker and server reside on different nodes. Moreover, a central broker
may become a bottleneck when there are many clients and servers. The handle-driven
broker (shown in Fig. 1.b) tries to mitigate these problems. The broker
returns to the client a handle containing all the information required to communicate
directly with the server. The client is then able to communicate directly
with the server many times, reducing the communication overhead.
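
As a rough illustration of why the handle-driven variant reduces overhead, the sketch below simply counts the messages exchanged for a number of requests. It rests on assumptions that are not stated in the text: every request and every reply counts as one message, the handle is obtained once and then reused, and processing costs are ignored. The function names are hypothetical.

```python
def forwarding_broker_messages(n_requests):
    """client -> broker -> server -> broker -> client for every request."""
    return 4 * n_requests


def handle_driven_broker_messages(n_requests):
    """One round trip to the broker for the handle, then direct round trips."""
    if n_requests == 0:
        return 0
    return 2 + 2 * n_requests


def saved_messages(n_requests):
    """Messages saved by reusing the handle instead of forwarding each request."""
    return forwarding_broker_messages(n_requests) - handle_driven_broker_messages(n_requests)
```

Under these assumptions the two variants cost the same for a single request, and the handle-driven broker saves two messages for every additional request to the same server.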

The critical section pattern applies to cases where two or more active objects
share the same passive object. The constraint «sequential» attached to the
methods of the shared object indicates that the callers must coordinate outside
the shared object (for example, by means of a semaphore) in order to ensure
correct behaviour. Such synchronization introduces performance delays, and
must be represented in a performance model, as shown in section 4, Fig. 5.
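
A minimal sketch of this coordination, assuming Python threads as the active objects and a binary semaphore held by the callers, is given below. The class and function names are hypothetical; the point is only that the shared object itself contains no locking.

```python
import threading


class SharedCounter:
    """Passive shared object; its method is «sequential», so it does no locking."""

    def __init__(self):
        self.value = 0

    def increment(self):
        current = self.value        # read-modify-write, unsafe without outside coordination
        self.value = current + 1


def user(shared, guard, repetitions):
    """Active object: does its critical work only while holding the semaphore."""
    for _ in range(repetitions):
        with guard:                 # callers coordinate outside the shared object
            shared.increment()


def run_users(n_users, repetitions):
    shared = SharedCounter()
    guard = threading.Semaphore(1)  # binary semaphore serializes the critical sections
    threads = [
        threading.Thread(target=user, args=(shared, guard, repetitions))
        for _ in range(n_users)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.value
```

The time each user spends waiting on `guard` is the serialization delay that the performance model must account for.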



3 LQN Model 

A brief description of the LQN modelling technique is given in this section, in 
order to make the paper self-contained. LQN was developed as an extension of
the well-known Queueing Network (QN) model, at first independently in [17,18] 
and [9], then as a joint effort [4]. The LQN toolset presented in [4] includes both 
simulation as well as analytical solvers that merge the best previous approaches. 
The main difference with respect to QN is that in LQN a server may become 
a client to other servers while serving its own clients. An LQN model is rep- 
resented as an acyclic graph whose nodes are software entities and hardware 
devices, and whose arcs denote service requests. The software entities, named 
tasks, are drawn as parallelograms and the hardware devices as circles. Different
nodes play the roles of clients (only outgoing arcs), intermediate servers (both 
incoming and outgoing arcs) and pure servers (only incoming arcs). The last 
usually represent hardware resources (such as processors, I/O devices, communication
networks, etc.). Fig. 2 shows a simple example of an LQN model for a
three-tiered client/server system: at the top there are two groups of stochastically
identical clients. Each client sends requests for a certain service offered by the
Application task, which represents the business layer of the system. Every kind of
service offered by an LQN task is modelled by a task entry, drawn as a parallelogram
"slice". An entry has its own execution times and demands for other
services (given as model parameters). In this case, each Application entry requires
services from two different Database entries. As the Database can process
several requests concurrently, it is modelled as a multi-server (i.e., a set of identical
servers sharing a common queue). Each multi-server replication models a
"virtual" thread that serves a request at a time. (Virtual threads may be imple- 
mented either as software threads of the same process, or as a set of identical 
processes that are serving requests from a common queue). Every software task 
is running on a given processor shown as a circle; more than one task can share 
the same processor. The word layered in the name LQN does not imply a strict 
layering: a task may call other tasks in the same layer, or skip over layers. All 
the arcs used in this example represent synchronous requests, where the sender 
of a request message is blocked until it receives a reply from the provider of ser- 
vice. It is possible to have also asynchronous request messages, where the sender 
doesn’t block after sending a request, and the server doesn’t reply. Although not 
explicitly illustrated in the LQN notation, each server has an implicit message 
queue where the incoming requests are waiting their turn to be served. Servers 
with more than one entry still have a single input queue, where requests for different
entries wait together. The default scheduling policy of the queue is FIFO,
but other policies are also supported. Typical results of an LQN model are re- 
sponse times, throughput, utilization of servers on behalf of different types of 
requests, and queueing delays. LQN was applied to different applications (such 
as databases, telecommunication systems, web servers). The model results help 
to identify software and/or hardware bottlenecks [6] that limit the system per- 
formance under different workloads and resource allocations. 
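
The structural side of an LQN model, an acyclic graph whose nodes are classified by the direction of their request arcs, can be sketched as follows. The node names in `EXAMPLE_ARCS` are hypothetical and only loosely follow the three-tier example; entries, multi-servers and model parameters are left out of this sketch.

```python
def classify_nodes(arcs):
    """Assign each node a role from the direction of its request arcs."""
    callers = {src for src, _ in arcs}
    callees = {dst for _, dst in arcs}
    roles = {}
    for node in callers | callees:
        if node in callers and node in callees:
            roles[node] = "intermediate server"   # incoming and outgoing arcs
        elif node in callers:
            roles[node] = "client"                # only outgoing arcs
        else:
            roles[node] = "pure server"           # only incoming arcs
    return roles


def is_acyclic(arcs):
    """Kahn's algorithm: the graph is acyclic if every node can be removed in turn."""
    arcs = set(arcs)
    nodes = {n for arc in arcs for n in arc}
    indegree = {n: 0 for n in nodes}
    for _, dst in arcs:
        indegree[dst] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for src, dst in arcs:
            if src == node:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return removed == len(nodes)


# Hypothetical request arcs (caller, callee) for a three-tier system.
EXAMPLE_ARCS = [
    ("ClientA", "Application"),
    ("ClientB", "Application"),
    ("Application", "Database"),
    ("Application", "AppCPU"),
    ("Database", "DbCPU"),
    ("Database", "Disk"),
]
```

In this sketch the hardware devices come out as pure servers and the Application and Database tasks as intermediate servers, matching the roles described above.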

4 Transformations of Architectural Patterns into LQN

A software system contains many components involved in various architectural 
connection instances (each described by a pattern/collaboration), and a com- 
ponent may play different roles in different patterns. The transformation of the 
architecture into a performance model is done in a systematic way, pattern by 
pattern. As expected, the performance of the system depends on the performance 
attributes of its components and on their interaction. Performance attributes are 
not central to the software architecture itself, but must be specified by the user 
in order to transform the architecture into a performance model. Such attributes 
describe the demands for hardware resources by the software components: al- 
location of processes to processors, average execution time for each software 
component, average demands for other resources such as I/O devices, communi- 
cation networks, etc. 

b) Transformation of the Pipeline with Buffer

Fig. 3. Transformation of two variants of the Pipeline Pattern to LQN models 



Pipeline and Filters. Fig. 3 shows the translation of different versions of 
this pattern, one using asynchronous messages for the pipeline, and the other a 
shared buffer. Each active filter becomes an LQN software server whose service 
time includes the processing time of the filter. In Fig. 3.a, the connector between
the two filters is modelled as an asynchronous LQN message. The CPU times for
send/receive system calls are added to the service times of the two LQN tasks, 
respectively. A network delay for the message can be represented in LQN as a 
delay attached to the arc. 

In the case of a pipeline with buffer, an asynchronous LQN arc is still re- 
quired, but it does not take into account the serialization delay due to the con- 
straint that buffer operations must be mutually exclusive. A third task will 
enforce this constraint. It has as many entries as the number of operations executed
by the tasks accessing the buffer (two in this case, "push" and "pull"). In
Fig. 3.b, exactly the same architectural pattern has two LQN counterparts, due 
to a difference in processor allocation. The execution of all buffer operations is 
charged to the same processor node in the first case, and to different processor 
nodes in the second case. 




a) Transformation of the Client-Server pattern with Forwarding Broker
b) Transformation of the Client-Server pattern with Handle-Driven Broker

Fig. 4. Transformation of two variants of the Client-Server Pattern to LQN models



Client-Server patterns. A client communicates with the server through a 
synchronous communication (rendezvous), where the client sends a request to 
the server and blocks until the reply from the server comes back. A server may of- 
fer a wide range of services (represented as the server’s methods) each one with 
its own performance attributes (execution time and number of visits to other 
servers). A client may invoke more than one of these services at different times. 
There are different variants of the pattern, depending on whether the clients communicate
directly with the servers or through a broker. As in the pipeline connection,
the CPU times required to execute system calls for send/receive/reply communication
primitives are added to the service times of the corresponding tasks.
Software developers of client-server systems are mostly interested in the compo- 
nents that are part of their application, and less in the details of the underlying 
midware, operating system or networking software. The use of UML collabo- 
rations comes in handy, because it allows us to hide unnecessary details. For 
example, client/server applications using a CORBA interface do not have to
show explicitly the "broker" component in their architecture (as it is not part of
the software application). Instead, a collaboration (such as illustrated in Fig. 1) 
can be used to indicate the type of desired client/server connection. However, 
the performance model will represent explicitly the broker and its interaction 
with the client and server counterparts. Fig. 4 illustrates the transformation of 
two variants of the client-server pattern with two kinds of brokers. The UML 
architectural descriptions are similar, differentiated only by the kind of UML 
collaboration used. However, their LQN models are quite different, as the con- 
nections have very different operating modes and performance characteristics. 
The forwarding broker (Fig. 4.a) is modelled as an LQN multi-server with as many
entries as server entries. Each task replication represents a "virtual" thread of
the broker. Such a thread accepts a client’s request, passes it to the server, then 
remains blocked until the server’s reply comes back and is relayed to the respec- 
tive client. While a broker thread is blocked, other threads get to run on the 
processor on behalf of other requests. Fig. 4.b represents the LQN model for the 
handle-driven broker. A client sends two kinds of messages: one to the broker for 
getting the handle, and the other directly to the desired server entry. Since the 
broker does the same kind of work for all the requests, no matter what server 
entry they need, the broker is modelled with a single entry. 
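
The difference between the two broker models can be made concrete with a small sketch that builds the broker entries and request arcs for a given list of server entries. The entry names (`fwd_...`, `get_handle`, `client`) are hypothetical, and the sketch ignores task replication, processors and visit counts.

```python
def forwarding_broker_model(server_entries):
    """Forwarding broker: one broker entry per server entry, requests chained through it."""
    broker_entries = ["fwd_" + entry for entry in server_entries]
    arcs = []
    for entry in server_entries:
        arcs.append(("client", "fwd_" + entry))   # client asks the broker
        arcs.append(("fwd_" + entry, entry))      # broker relays to the server entry
    return broker_entries, arcs


def handle_driven_broker_model(server_entries):
    """Handle-driven broker: a single broker entry, then direct client-to-server requests."""
    broker_entries = ["get_handle"]
    arcs = [("client", "get_handle")]
    arcs += [("client", entry) for entry in server_entries]
    return broker_entries, arcs
```

For a server with two entries, the forwarding model yields two broker entries and four arcs, whereas the handle-driven model yields one broker entry and three arcs, two of which bypass the broker.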

Critical Section. The transformation of the critical section collaboration
produces either the model given in Fig. 5.a or that in Fig. 5.b, depending on the
allocation of user processes to processor nodes (similar to the pipeline case). The
premise is that an LQN task cannot change its processor node. Since the operations on
the shared object (i.e., critical sections) may be executed by different threads
of control of different users running on different processors, each operation is
modelled as an entry that belongs to a different task f1 to fN running on its
user's node. Since these tasks must be prevented from running simultaneously, a
semaphore task is introduced in the LQN model. The performance attributes
to be provided for each user must specify critical and non-critical execution times
separately.

5 Graph Schema and Transformation Approach 

The graph schema defined according to the PROGRES language [10,11,12] is 
presented in Fig. 6, and represents classes and types of nodes and edges that 
can appear in the graph. The upper part of the figure contains the input schema 
for architectural descriptions and the lower part the output schema for LQN 
models (light-gray nodes). The input schema does not capture all the richness 
of UML, but only those elements that are necessary for converting a high-level
architecture into an LQN model.

Fig. 5. Transformation of the Critical Section Pattern

The advantage of basing the transformation on
architectural patterns expressed by UML collaborations is that such artifacts at a
higher level of abstraction greatly simplify the graph schema and the transformation
process. For example, in order to define the transformation to LQN, we used
our knowledge about the behaviour of patterns "hidden" inside the respective
collaborations, instead of explicitly representing that behaviour in different UML
diagrams. The latter approach would considerably raise the complexity of the
input language. The disadvantage of using such artifacts is that they have to
be pre-identified and represented in the schema and the transformation rules,
hence limiting the extensibility of the transformation process. This disadvantage
is somewhat mitigated by the fact that the number of high-level architectural
patterns is relatively small. In order to accommodate graphs in intermediary
translation stages, the two schemas are joined together by three nodes shown
in dark-gray at the base of the node class hierarchy (NODE, OBJ_TASK, and
OP_ENTRY). The collaboration nodes representing architectural patterns make
up a big part of the input schema. Inheritance is useful for classifying the different
patterns and their variants. "Role" edges connect the collaboration nodes to the
architectural component nodes, which are active and passive objects, their links
and operations. Some node attributes in the input schema represent performance
information that is ancillary to the software architecture, but has to be provided
by the user in order to build a performance model. Such attributes represent
processors and devices used by Active nodes, execution times of OPERATION
nodes, and number of visits of CallEdge nodes (which hold the attributes for the
operation invocation arcs).

Fig. 6. Joint graph schema for architectural patterns and the LQN model

The output schema reflects closely the LQN graph notation presented in 
section 3. The node types are "task", "device" and "entry". The LQN arcs may
represent three types of requests (synchronous, asynchronous and forwarding);
a parameter indicates the average number of visits associated with each request.
Since PROGRES edges cannot have attributes, we represent an LQN arc by three
elements: an incoming edge, a node carrying the parameter and an outgoing edge.
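
The edge-node-edge encoding can be illustrated with a small attributed graph in which, as in PROGRES, only nodes carry attributes. The `Graph` class, the `add_lqn_arc` helper and the choice of labelling the edge entering the arc node "in" and the edge leaving it "out" are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)   # node id -> attribute dictionary
    edges: set = field(default_factory=set)     # (source id, label, target id), no attributes

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = attrs

    def add_edge(self, src, label, dst):
        self.edges.add((src, label, dst))


def add_lqn_arc(graph, from_entry, to_entry, kind, nb_visits):
    """Encode one LQN request arc as: incoming edge, attribute node, outgoing edge."""
    arc_id = "arc:" + from_entry + "->" + to_entry
    graph.add_node(arc_id, kind=kind, nb_visits=nb_visits)  # the node carries the parameter
    graph.add_edge(from_entry, "in", arc_id)                # edge entering the arc node
    graph.add_edge(arc_id, "out", to_entry)                 # edge leaving the arc node
    return arc_id
```

A synchronous request with an average of 2.5 visits from entry `e1` to entry `e2` then becomes one extra node and two plain edges.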

Graph transformation rules have been defined for each architectural pattern, 
following closely the transformations described in the previous section. A PRO- 
GRES transaction is executed for every architectural pattern found in the input 
architectural description graph. The translation process ends when all the patterns
have been processed. The final result is an LQN model that can be written
to a file according to a predefined LQN model format [4]. The following trans- 
formation approach was followed: 

• The collaboration nodes do not have an LQN equivalent, but play an important
role during the transformation process by deciding which transaction to
execute.

• Each architectural component (i.e., object) is converted to an LQN task,
which is the reason for introducing a common base class OBJ_TASK in the
graph schema. However, the correspondence between components and tasks is
not bijective, as in some cases a single object may generate more than one task
for the following reasons: to charge the execution times correctly to the various
processors (as in Fig. 5.b), or to model processes that are not part of the application
but of the underlying midware (such as the brokers in Fig. 4).

• An object operation is usually converted into an LQN entry, which led to
defining a common base class OP_ENTRY in the graph schema. There are some
exceptions, as illustrated in Fig. 3.b and Fig. 5.b, where an operation is converted
into an entry and a task.

• Processors and devices, which are attributes in the architectural view, become
full-fledged nodes of type "Device" in LQN. This happens because the
issue of resource allocation is secondary to the software development process,
but is central to performance analysis.

An example of a typical transaction performs the graph transformation illustrated
in Fig. 5, where the same architecture can lead to two different LQN
models depending on the processor allocation. Each of the two transformations
is described by a different production rule (Fig. 7 shows the rule for the case
from Fig. 5.b). The transaction proceeds in three steps. In the first step, the
processor allocation is checked and the appropriate case, a or b, is chosen. In
the second step, the corresponding production rule is applied repeatedly, once
for every user of the critical section. Fig. 7 shows that nodes '1 and '3, which
are objects (one representing the user, the other the shared object), are kept as
tasks on the right-hand side. Also, the operation node '4 is kept as an entry.
Moreover, a new entry 6' is added to task 1', because the LQN syntax requires
that each task have at least one entry. Node '5 of type CallEdge, which
holds the operation invocation attributes (such as the number of visits made by the
user to the critical section), will disappear, being replaced by a whole subgraph
that connects entry 6' to entry 4'. This subgraph contains a new entry 8' of
the semaphore task, and two LQN synchronous requests: one from entry 6' to
entry 8' (represented by node 7' and its "in" and "out" edges) and one from
entry 8' to entry 4' (represented by node 9' and its "in" and "out" edges). The
attribute NbVisits of node '5 is transferred to node 7', while the attribute NbVisits
of node 9' is set to one. A processor node 10' (used by both tasks 1' and 3') is
added on the right-hand side. The collaboration node 2' representing the critical
section is disconnected from all the other nodes on the right-hand side, but is
kept until all the users are processed. The last step of the transaction adds a
new task node and its "dummy" processor to represent the semaphore task from
Fig. 5, and removes the collaboration node 2'.

production TransformCritSectionDiffProc =

    transfer 6'.Name := '1.Name & "Entry";
             8'.Name := '4.Name & "CritSect";
             7'.FromName := 6'.Name;
             7'.ToName := 8'.Name;
             7'.NbVisits := '5.NbVisits;
             9'.FromName := 8'.Name;
             9'.ToName := 4'.Name;
             9'.NbVisits := 1;
             10'.Name := '1.Processor;

Fig. 7. Production Rule for the Critical Section with different processors
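
The attribute transfers performed by this rule for a single user can be mirrored in ordinary code. The sketch below is not PROGRES and does not perform graph matching; it only assumes that the matched user, operation and call edge are already available as plain dictionaries and values, and all function and key names are hypothetical.

```python
def transform_crit_section_diff_proc(user, operation, nb_visits):
    """Compute the attributes of the new right-hand-side nodes for one user.

    user:      dict with "name" and "processor" (node '1)
    operation: dict with "name" (node '4)
    nb_visits: visits made by the user to the critical section (attribute of node '5)
    """
    user_entry = user["name"] + "Entry"             # node 6': new entry of the user task
    sem_entry = operation["name"] + "CritSect"      # node 8': new entry of the semaphore task
    request_to_semaphore = {                        # node 7': synchronous request 6' -> 8'
        "from": user_entry,
        "to": sem_entry,
        "nb_visits": nb_visits,
    }
    request_to_operation = {                        # node 9': synchronous request 8' -> 4'
        "from": sem_entry,
        "to": operation["name"],
        "nb_visits": 1,
    }
    return {
        "user_entry": user_entry,
        "semaphore_entry": sem_entry,
        "requests": [request_to_semaphore, request_to_operation],
        "processor": user["processor"],             # node 10': processor of the user
    }
```

Calling this function once per user of the critical section corresponds to the second step of the transaction; adding the semaphore task itself and removing the collaboration node are left to the final step.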



6 Case Study: A Telecommunication System

This section presents the architecture of an existing telecommunication system 
which is responsible for developing, provisioning and maintaining various intelli- 
gent network services, as well as for accepting and processing real-time requests 
for these services (see Fig. 8). The system was modelled in LQN, and its per- 
formance analyzed in [16]. Here we consider only the transformation from the 
system’s UML architecture to its LQN model. 

Fig. 8. UML description of the high-level architecture of a telecommunication
system



The real-time scenario modelled in [16] starts from the moment a request
arrives at the system and ends after the service has been completely processed and
a reply has been sent back. As shown in Fig. 8, a request is passed through several
filters of a pipeline: from the Stack process to the IO process to the RequestHandler
and all the way back. The main processing is done by the RequestHandler, which accesses
a real-time database to fetch an execution "script" for the desired service, then
executes the steps of the script accordingly. The script may vary in size and
types of operations involved, and hence the workload varies widely from one type
of service to another (by one or two orders of magnitude). Based on experience
and intuition, the designers decided from the beginning to allow for multiple
replications of the RequestHandler process in order to speed up the system. Two
shared objects, ShMem1 and ShMem2, are used by the multiple RequestHandler
replications. The system was meant to run either on a single processor or on
a multi-processor with shared memory. Processor scheduling is such that any
process can run on any free processor (i.e., the processors were not dedicated to
specific tasks). Fig. 9 shows the LQN model of the system obtained by applying
the graph transformations proposed in the paper. The performance analysis of 
the model is presented in [16] and is, unfortunately, beyond the scope of this 
paper. The model was solved with existing LQN solvers [4] and the highest 
achievable throughput was found for different loads and configurations. The 
performance analysis has also exposed some weaknesses in the original software 
architecture due to excessive serialization at the IO process and the double buffer,
which starts showing up when more processing capacity is added to the system. 
After removing the serialization constraints, a new software bottleneck emerges 
at the database, which leads to the conclusion that the software architecture 
does not scale up well [16]. The study illustrates the usefulness of applying 
performance modelling and analysis to software architectures. 




7 Conclusions 

The main challenge for the automatic generation of LQN performance models
from software architecture descriptions stems from the fact that the two views
have different semantics, purpose and focus, which must be bridged by the translation
process. The architectural view represents only the software components
of the application under development, and may hide operating system and midware
services, which must be represented, however, in a performance model. On
the other hand, many details of the architecture are irrelevant to the performance
model. The issue of resource allocation and resource demands represents another
important discrepancy between the two views. Another challenge is raised by the
richness of the UML language. This paper is only a step in a longer research
effort which aims to bridge the gap between software architecture and performance
modelling. So far, the graph rewriting formalism has proven very useful
in dealing with these challenges.









References 

1. R. Allen, D. Garlan, "A Formal Basis for Architectural Connection", ACM Transactions
on Software Engineering and Methodology, Vol. 6, No. 3, pp. 213-249, July 1997.

2. G. Booch, J. Rumbaugh, I. Jacobson, The Unified Modeling Language User Guide,
Addison-Wesley, 1999.

3. F. Buschmann, R. Meunier, H. Rohnert, P. Sommerlad, M. Stal, Pattern-Oriented
Software Architecture: A System of Patterns, Wiley Computer Publishing, 1996.

4. G. Franks, A. Hubbard, S. Majumdar, D. Petriu, J. Rolia, C. M. Woodside, "A
toolset for Performance Engineering and Software Design of Client-Server Systems",
Performance Evaluation, Vol. 24, No. 1-2, pp. 117-135, November 1995.

5. J. Dilley, R. Friedrich, T. Jin, J. Rolia, "Measurement Tool and Modelling Techniques
for Evaluating Web Server Performance", in Lecture Notes in Computer Science,
Vol. 1245, Springer, pp. 155-168, R. Marie, B. Plateau, M. Calzarossa, G. Rubino
(eds), Proc. of 9th Int. Conference on Modelling Techniques and Tools for Performance
Evaluation, June 1997.

6. J. E. Neilson, C. M. Woodside, D. Petriu, and S. Majumdar, "Software bottlenecking
in client-server systems and rendezvous networks", IEEE Transactions on Software
Engineering, Vol. 21, No. 9, pp. 776-782, September 1995.

7. Object Management Group, The Common Object Request Broker: Architecture and
Specification, Object Management Group and X/Open, Framingham, MA and Reading,
Berkshire, UK, 1992.

8. D. Petriu, X. Wang, "Deriving Software Performance Models from Architectural
Patterns by Graph Transformations", Proc. of the Sixth International Workshop on
Theory and Applications of Graph Transformations TAGT'98, Paderborn, Germany,
Nov. 1998.

9. J. A. Rolia, K. C. Sevcik, "The Method of Layers", IEEE Transactions on Software
Engineering, Vol. 21, No. 8, pp. 689-700, August 1995.

10. A. Schürr, "Introduction to PROGRES, an attribute graph grammar based specification
language", in Graph-Theoretic Concepts in Computer Science, M. Nagl (ed),
Vol. 411 of Lecture Notes in Computer Science, pp. 151-165, 1990.

11. A. Schürr, "PROGRES: A Visual Language and Environment for PROgramming
with Graph Rewrite Systems", Technical Report AIB 94-11, RWTH Aachen, Germany,
1994.

12. A. Schürr, "Programmed Graph Replacement Systems", in Handbook of Graph
Grammars and Computing by Graph Transformation, G. Rozenberg (ed), pp. 479-546,
1997.

13. M. Shaw, D. Garlan, Software Architectures: Perspectives on an Emerging Discipline,
Prentice Hall, 1996.

14. M. Shaw, "Some Patterns for Software Architecture", in Pattern Languages of
Program Design 2 (J. Vlissides, J. Coplien, and N. Kerth eds.), pp. 255-269,
Addison-Wesley, 1996.

15. C. U. Smith, Performance Engineering of Software Systems, Addison-Wesley, 1990.

16. C. Shousha, D. C. Petriu, A. Jalnapurkar, K. Ngo, "Applying Performance Modelling
to a Telecommunication System", Proceedings of the First International Workshop
on Software and Performance, Santa Fe, USA, pp. 1-6, Oct. 1998.

17. C. M. Woodside, "Throughput Calculation for Basic Stochastic Rendezvous Networks",
Performance Evaluation, Vol. 9, No. 2, pp. 143-160, April 1988.

18. C. M. Woodside, J. E. Neilson, D. C. Petriu, S. Majumdar, "The Stochastic
Rendezvous Network Model for Performance of Synchronous Client-Server-like Distributed
Software", IEEE Transactions on Computers, Vol. 44, No. 1, pp. 20-34, January 1995.



On a Uniform Representation of Transformation Systems



Paolo Bottoni, Francesco Parisi-Presicce, Marta Simeoni

Department of Computer Science, University of Rome La Sapienza
Via Salaria 113, 00198 Roma (Italy) e-mail: bottoni/parisi/simeoni@dsi.uniroma1.it



Abstract. We discuss an intermediate language to represent transitions
defining behaviours of autonomous agents. The language allows a uni- 
form representation of several diagrammatic languages for specification of 
reactive systems, based on an underlying notion of transition. The trans- 
lation of graph transformations to this language opens an opportunity 
for a notion of communication between agents represented by graphs. 



1 Introduction 

Languages for specification of reactive systems have largely used rewriting sys- 
tems to define the behaviour of agents or of classes of agents. Examples are Linear 
Objects [2] and Gamma [3]. On the other hand, languages for representation of 
characteristics of agents and of their forms of communication have generally ex- 
ploited declarative tools, with a strong flavour of languages developed in the 
Artificial Intelligence field, generally based on Lisp or Prolog, such as KIF [14]. 

The languages of the first type are more suited for the definition of autonomous
agents with coordination abilities, but lack expressivity w.r.t. the specific goals
of communication; languages of the second type are more dependent on a client-server
paradigm for communication among agents. An intermediate language
could allow the expression of autonomous behaviours of agents and of different
forms of coordination and communication, and be easily translatable into specification
languages of the different types. This becomes particularly important
in the specification of open systems, where few assumptions can be made on
the inner characteristics of agents, but a common language is needed to allow
communication among them. This requires the identification of components of
the transitions which define an agent's behaviour, as well as of the infrastructures
defining the forms of coordination and communication among agents.

Graph transformations [18] have been used primarily as a means for specifying 
transformations of a distributed system, less so to specify the behaviour of in- 
dividual agents which have to coordinate themselves by exchanging information 
through some communication infrastructure. Rather, they have been used to
describe such an infrastructure and its possible evolution. 

We explore here the possibility of introducing an explicit notion of communica- 
tion in graph transformation, as communication of graph transformations, and 
discuss it in the light of a proposed intermediate language for the specification 
of reactive systems. This would make it possible to specify multi-agent systems 
with agents described by heterogeneous mechanisms. 

Paper outline. Section 2 discusses related work in the fields of specification 
of reactive agents and of distributed graph transformations. Section 3 presents 
the intermediate language WIPPOG, based on the identification of abstract 
components of a transformation, and the underlying model of reactive system. 
Section 4 shows how graph transformations can be translated into WIPPOG and
presents a model for explicit communication for reactive systems defined through
graph transformations, enriching the translation into WIPPOG with the related
constructs. Section 5 sketches applications and Section 6 draws some conclusions.



2 Related work 

Several languages for specification of open or concurrent systems use primitives 
for asynchronous communication, such as messages in Actors [1], the prefix in 
LO [2], or the out operator in Linda [7]. On the other hand, protocols are defined 
to express how communication is managed by the receiving agents (e.g. execution
of messages in mailboxes in Actors, internalisation of transmitted resources in LO,
or active reading after pattern matching through the in and rd operators in Linda).

Other languages express synchronous communication by specifying the concur- 
rent transformation of different agents involved in a coordination pattern, as in 
the Maude language via rewriting rules [16]. 

Models of both types have been applied in the graph transformation field. The 
Actors model has given rise to Actor Grammars, in which the state of a dis- 
tributed system is represented by a graph, with nodes denoting actors and mes- 
sages, and edges representing actors’ acquaintances or message destinations [15]. 

Synchronous models define conditions under which several graph transformations 
can occur at different parts of a graph. For example, in [8] agents are sets of
productions which coordinate themselves by transforming different subgraphs
with a common context.

In a series of papers [19], [13], Taentzer explores distributed graph rewriting as
resulting from a hierarchical view of a distributed system seen as a graph describing
a configuration/communication network, whose nodes contain local graphs
describing the state of individual agents. An agent embodies its interface to other 
agents, so that asynchronous communication consists of applying a rule which 
modifies the content of this interface, and synchronous communication consists 
of the simultaneous application of local rules to different local graphs. 

In this paper, we deal with a notion of agent as a set of resources encapsulated
together with a set of abilities to consume and produce resources. In this view
a message is a resource made available to other agents, or received from other
agents, which may give rise to transformations inside the agent. Hence, we are
interested in completely local transformations, as specified by a transition system
which defines the abilities of the agent. Moreover, productions are seen in turn
as resources that are always available, but which can be transformed under the
effect of received messages. Therefore, we do not mix transformations of the static
configuration structure and of the local state of an agent, as in Actor Grammars,
nor do we impose any type of synchronisation on transformations (which might
however arise from local protocols established by groups of agents), nor do we
embed the communication protocol in the application of the transformation, as
in Taentzer’s approach.



3 An intermediate specification language for reactive 
systems 

The set of possible behaviours of reactive systems is often defined via production 
rules that describe the transformations that an agent undergoes or produces in 
its environment when certain situations occur. 

Such production rules are generally expressed in the form of before/after rules, 
which specify pre- and post-conditions of the transformation. Syntactic differences,
however, make it difficult to identify the common aspects of different formalisms.
Consider for example Finite State Automata, where transformations
(state transitions) are triggered by external inputs, and Petri nets, where trans- 
formations (transitions or events) occur when a certain submarking is present 
in the net. This situation makes it difficult to bring agents specified in different 
forms to interoperate in a heterogeneous distributed environment. 

Problems of interoperability are usually confronted by introducing intermediate 
languages, such as IDL for CORBA. Here we propose an intermediate language 
suited to expressing transitions defining the possible behaviours of an agent. 

The proposal is based on an analysis of what may appear in the pre- and post-conditions
of a transition. Namely, a pre-condition may require that some properties
hold for some component of the agent’s state (internal trigger), that a
certain input must be received by the agent (external trigger) and that additional
conditions are met in the agent’s state (internal constraints). Symmetrically,
post-conditions may require that, after application, some properties hold
in the agent’s state (internal consequences), some outputs are emitted in the
agent’s environment (external consequences), and some computational activities
are performed or started by the agent (pragmatics). Following this analysis, we
propose to describe a transition by specifying the following components:

- WHEN: indicates the internal trigger, i.e. the preconditions that must hold 
in the agent; 

- GETS: indicates the external trigger, i.e. stimuli from the environment;

- IF: indicates the internal constraints, i.e. a predicate which must be satisfied
by the agent’s properties;

- PRODUCES: indicates the internal consequences, i.e. the postconditions
that the transition brings to hold in the agent;

- OUTS: indicates the external consequences, i.e. the possible output that the
agent emits towards the outside;

- PROCESSES: indicates the pragmatics, i.e. possible actions performed inside
the agent (including those to evaluate parameters for the PRODUCES
and OUTS components).

We call the intermediate language WIPPOG, from the initials of the six components.
In the following we will show only components which are not empty
in a transition. Consider, for instance, the following transition in a generalised
finite state machine asserting that when the automaton is in the state "1" and
receives the input "a", it goes to the state "4", issuing the output "b".




This transition is translated into the following representation:

- WHEN:

- GETS: "msg(a)"

- PRODUCES:

- OUTS: "msg(b)"

Note that messages are encapsulated in msg. This allows the agent to distinguish
between resources internally produced and those received from the outside.
If we want to indicate that resources can also be internally produced, "a" and
"b" will go in the WHEN and PRODUCES components respectively.

In an analogous way, a representation for Petri Nets will indicate places by
"place(Name, TokenNumber)". Thus, a transition consuming a token from a
place "1" and two tokens from a place "4", and putting a token in a place "7"
(see Figure 1), becomes:

- WHEN: "place(1,X)" "place(4,Y)" "place(7,Z)"

- IF: "X ≥ 1" "Y ≥ 2"

- PROCESSES: "X1 = X - 1" "Y1 = Y - 2" "Z1 = Z + 1"

- PRODUCES: "place(1,X1)" "place(4,Y1)" "place(7,Z1)"
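
The four components of this transition can be read as the steps of a small function. The following is only a minimal sketch in Python, not part of WIPPOG itself: the dictionary encoding of the marking and the name fire are assumptions made for illustration, and the guard is read as "at least one token in place 1 and at least two in place 4".

```python
def fire(marking):
    """Apply the transition of the Petri net example if it is enabled.

    `marking` is a hypothetical encoding: a dict from place name to token count.
    Returns the new marking, or None when the transition is not enabled.
    """
    # WHEN: bind X, Y, Z to the token counts of places 1, 4 and 7.
    x, y, z = marking[1], marking[4], marking[7]
    # IF: enough tokens must be present in the input places.
    if not (x >= 1 and y >= 2):
        return None
    # PROCESSES: compute the new token counts.
    x1, y1, z1 = x - 1, y - 2, z + 1
    # PRODUCES: the consumed place resources are replaced by the updated ones.
    new_marking = dict(marking)
    new_marking.update({1: x1, 4: y1, 7: z1})
    return new_marking
```

Here the WHEN resources are consumed and re-produced with new attribute values, which is how the sketch models the "before/after" reading of the rule.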

The WIPPOG representation could allow us to extend the models of finite state 
automata and Petri nets, by expressing additional conditions in the otherwise 
empty components. Thus for instance, the IF and PROCESSES components 
in the specification of FSA could be used to express transitions in hybrid systems, 
where attributes are attached to states and transitions can be conditioned on
the value of these attributes. On the other hand, the addition of GETS and
OUTS components could be used to simplify the expression of user interaction,
when Petri nets are used to control dialog in an interactive system [4].




Fig. 1. A transition in a Petri Net 



We now give a more complete example, using a graph transformation system to 
play tic-tac-toe, showing translations in WIPPOG beside the rules. The example 
illustrates a case of communication through access to a common support. 

Example 1 (Tic-tac-toe). Consider two agents, Agent1 and Agent2, playing tic-tac-toe
with each other on a common gameboard, with initial state:




[Figure: the initial state, an empty 3x3 board together with a node turn with value 1 and a node counter with value 0]



where each line connecting two cells of the board denotes their proximity and
stands for two directed arrows going in opposite directions. Each cell has two
attributes (not drawn in the picture) denoting its position in the board. The
node labeled turn has an attribute denoting which agent is currently playing:
Agent1 is the one beginning the game. The special value 3 for this attribute
is reserved for the board-cleaning activity. The node labeled counter has an
attribute counting how many moves have been made since the beginning of each
game. Its value will be tested if the table is completely filled with X and O, but
there is no winner for the current game.

The agent’s rules act only on the attribute values and not on the graph structure: 
they are represented by giving only left-hand and right-hand sides and not the 
interface components of rules (which specify that the structure is preserved). 

Agent1 has nine move rules (one for each possible instantiation of i and j), like
the following one. Agent2 has similar rules, but conditioned on the value 2 for the
attribute turn. It would be possible to have rules describing a winning strategy
for Agent1, but for this example we are only interested in showing coordination
between agents.

WHEN: "turn(1)" "counter(N)" "place(I,J,"BLANK")"
PROCESSES: "M = N + 1"
PRODUCES: "turn(2)" "counter(M)" "place(I,J,"X")"



[Figure: the move rule. Left-hand side: turn = 1, counter = n, a blank cell; right-hand side: turn = 2, counter = n+1, the cell marked X]



Agent1 also possesses eight rules like the one on the left side of the following figure,
to check whether it has won the current game (to be applied on rows, columns and
diagonals):




[Figure: the winning rule. Left-hand side: turn = 2, counter = n, three aligned cells marked X; right-hand side: turn = 3, counter = 0, the three cells blank]

WHEN: "turn(2)" "counter(N)" "place(I,J,"X")"
"place(I,K,"X")" "place(I,L,"X")"
IF: "K = J + 1" "L = J + 2"
PRODUCES: "turn(3)" "counter(0)" "place(I,J,"BLANK")"
"place(I,K,"BLANK")" "place(I,L,"BLANK")"



If the table is completely filled with X and O, Agent1 uses the following rule to
start the table-cleaning phase:



[Figure: the rule starting the cleaning phase. Left-hand side: turn = 1, counter = 9; right-hand side: turn = 3, counter = 0]

WHEN: "turn(1)" "counter(9)"
PRODUCES: "turn(3)" "counter(0)"



The rule sets the turn attribute to 3 to start the cleaning phase, and resets the 
counter attribute value. 
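
The move rule and the rule starting the cleaning phase can be mimicked over an explicit state. The sketch below is written in Python under assumptions of our own: the state is a dictionary with keys turn, counter and board, and the function names are hypothetical. It is meant only to illustrate how the WHEN, PROCESSES and PRODUCES components are read.

```python
BLANK = "BLANK"

def initial_state():
    """Empty 3x3 board, Agent1 to play, no moves made yet."""
    board = {(i, j): BLANK for i in range(1, 4) for j in range(1, 4)}
    return {"turn": 1, "counter": 0, "board": board}

def move_agent1(state, i, j):
    """One of the nine move rules of Agent1, instantiated on cell (i, j)."""
    # WHEN: "turn(1)" "counter(N)" "place(I,J,BLANK)"
    if state["turn"] != 1 or state["board"][(i, j)] != BLANK:
        return None
    # PROCESSES: M = N + 1
    m = state["counter"] + 1
    # PRODUCES: "turn(2)" "counter(M)" "place(I,J,X)"
    board = dict(state["board"])
    board[(i, j)] = "X"
    return {"turn": 2, "counter": m, "board": board}

def start_cleaning(state):
    """Rule starting the cleaning phase, as stated in the text."""
    # WHEN: "turn(1)" "counter(9)"
    if state["turn"] != 1 or state["counter"] != 9:
        return None
    # PRODUCES: "turn(3)" "counter(0)"; the board itself is left untouched.
    return {"turn": 3, "counter": 0, "board": dict(state["board"])}
```

Each function returns None when its WHEN part has no match, so that a caller can try the agent's rules in turn.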

To clean the table, Agent1 uses nine rules of the following type:



[Figure: a cleaning rule. With turn = 3, a cell marked X becomes blank]

WHEN: "turn(3)" "place(I,J,"X")"
PRODUCES: "turn(3)" "place(I,J,"BLANK")"



Note that only cells filled with X are cleaned by Agent1. Agent2’s rules differ
from those of Agent1 just in the use of O instead of X and the complementary
values of the turn attribute. Finally, both agents have the following rule which
allows Agent1 to start a new game.



[Figure: the rule starting a new game; the turn attribute changes from 3 to 1]

WHEN: "turn(3)" "place(1,1,"BLANK")" ...
"place(3,3,"BLANK")"
PRODUCES: "turn(1)" "place(1,1,"BLANK")" ...
"place(3,3,"BLANK")"

By the examples above, we see that we regard the state of the agent as a collection
of typed resources, so that an agent’s transition causes the production
or consumption of resources. Resources are considered to be consumed internally
to the agent. The explicit representation, by the components GETS and
OUTS, of communication with the outside, encapsulated in special resources
of type msg, supports a view of communication as the transmission of resources
to be consumed in the receiving agents. Mechanisms of internalisation can thus
be devised by which the message content is extracted and made available to the
agent, regardless of its origin. Such internalisation rules have the typical form:

- GETS: "msg(X)"

- PROCESSES: "Y = translate(X)"

- PRODUCES: "Y"

where X is some resource and Y is a translation in the agent’s internal language. 
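
As a minimal illustration of such an internalisation rule, the following Python sketch consumes one message and produces its translation as an internal resource. The list-based encoding of the import area and of the agent's state, and the name internalise, are assumptions introduced here; the translate function is supplied by the caller.

```python
def internalise(import_area, state, translate):
    """Apply the internalisation rule once.

    `import_area` and `state` are plain lists of resources (an assumed encoding).
    Returns True if a message was consumed, False if none was waiting.
    """
    if not import_area:
        return False
    x = import_area.pop(0)   # GETS: consume one msg(X)
    y = translate(x)         # PROCESSES: Y = translate(X)
    state.append(y)          # PRODUCES: Y becomes an internal resource
    return True
```

After the call, the resource Y is indistinguishable from one produced inside the agent, which is the point of the internalisation step.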

The class of transformations specified by WIPPOG includes rewriting systems in 
which transitions occur by substituting elements for elements in a way completely 
internal to the agent and without side effects. The rules themselves can be seen as 
resources of a special type, in particular they are assumed never to be consumed, 
as is typical of the LO model. By this view, we open the possibility of embodying
the rules available to the agent in its state, but also to express the same behaviour 
of an agent as a WIPPOG rule, typically described as: 

- WHEN: "rule(W : W1, I : I1, P : P1, P : P2, O : O1, G : G1)"

- IF: "eval(I1)"

- GETS: "G1"

- PROCESSES: "eval(P1)"

- PRODUCES: "P2" "rule(W : W1, I : I1, P : P1, P : P2, O : O1, G : G1)"

- OUTS:

This opens the possibility of defining metalevel actions by which the set of avail- 
able productions is inspected and manipulated, but also of reflective actions, by 
which an agent transforms the law of application of productions. For example, 
one could transmit the set of transformations of an agent together with a WIP- 
POG specification embedded in a special resource behaviour, which, once read, 
could define the operational semantics for the transmitted rules. 

In the next section, we will consider a specific form of such transformations, for 
the case of Graph Transformation Systems. 



4 Graph transformations with communication 

A graph transformation system specifies the evolution of the structure of a distributed
system, where nodes represent components of the system and edges
define the structure of the system, thus placing constraints on the possible simultaneous
modifications of components.

We are interested in notions of graph transformations that comply with the 
requirements for reactive systems discussed in the previous section; in particular 
locality of the transformation and total specification of a transformation within 
a single rule. This rules out transformations based on algorithmic approaches 
[10], and leads us to the algebraic approach [11], [12]. Within this, we turn to 
the DPO approach, since we do not want side-effects, such as edge removal, not
specified by the rule. Moreover, the model of transformation based on resource 
consumption and production that we have chosen as the underlying semantics for 
our intermediate language matches well with the gluing view of graph rewriting 
in the DPO approach. For this reason, we restrict the treatment to agents whose 
behaviour, as defined above, is not allowed to change. While we refer to the literature
for a description of the DPO approach, we only state here that we admit the use
of labelled and attributed graphs.

We consider agents defined by Graph Transformation Systems. An agent is a
pair ag = (g, P), where g is a graph, also called the agent’s state graph, and P is
a set of productions. Productions in P can be applied to the state graph of the
agent. These productions can also be represented in the intermediate language
by identifying the substitution of the left side L by the right side R, preserving
the interface K, as the transition defined by the production.

To this end, we assume a representation of graphs and graph transformations exploiting
terms of type node(X) and edge(Z, [X, Y]), where X, Y and Z represent
suitable identifiers allowing the elements to be distinguished. An edge is identified
by its own identifier and contains the identifiers of the nodes it connects.
Additional attributes could be added by extending the arity of the constructors
node and edge. Hence, a production L ← K → R is represented by listing the
nodes and edges in the three components, and the gluing condition is enforced
by equality of identifiers. In this case, we see translation to WIPPOG as the
generalisation of what is shown in Example 1:



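- WHEN: "node(X^{L-K}_1)" ... "node(X^{L-K}_{n_{L-K}})" "node(X^K_1)" ... "node(X^K_{n_K})"
"edge(Z^{L-K}_1, [X_i, X_j])" ... "edge(Z^K_{m_K}, [X_k, X_l])"

- PRODUCES: "node(X^K_1)" ... "node(X^K_{n_K})" "node(X^{R-K}_1)" ... "node(X^{R-K}_{n_{R-K}})"
"edge(Z^K_1, [X_i, X_j])" ... "edge(Z^{R-K}_{m_{R-K}}, [X_k, X_l])"

where the superscripts L, K, and R indicate the components of the rule.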
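
The translation of a production into WHEN and PRODUCES components can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: terms are encoded as tuples, matching is by exact identifiers (no variable binding), and the DPO dangling condition is not checked. The names node, edge, to_wippog and apply_rule are hypothetical.

```python
def node(x):
    """Term node(X)."""
    return ("node", x)

def edge(z, src, dst):
    """Term edge(Z, [X, Y]): identifier plus the nodes it connects."""
    return ("edge", z, (src, dst))

def to_wippog(L, K, R):
    """Translate a production L <- K -> R given as sets of terms.

    WHEN lists the elements of L (those in L - K and those in K);
    PRODUCES lists the elements of R (those in K and those in R - K).
    Gluing is enforced by equality of identifiers, so K must be in both sides.
    """
    assert K <= L and K <= R, "interface must be contained in both sides"
    return {"WHEN": set(L), "PRODUCES": set(R)}

def apply_rule(state, rule):
    """Consume the WHEN resources and produce the PRODUCES ones, if matched."""
    if not rule["WHEN"] <= state:
        return None
    return (state - rule["WHEN"]) | rule["PRODUCES"]
```

Since the interface K appears in both components, its elements are consumed and immediately produced again, which corresponds to their preservation.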

The WIPPOG model provides limited support for negative application conditions.
Actually, it is not possible to state absence of subgraphs, but only to
state that some attributes of nodes mentioned in L or K must or must not have
some values. With a representation of nodes of type node(X, List), where List
contains a list of the edges attached to the node, it is also possible to express
absence of edges between two nodes in the left-hand side of the production.

An agent is able to communicate with others by using special productions in
its set P which allow sending messages among agents, upon the execution of
specific transformations. Hence, we enrich graph transformation systems with
a notion of explicit message in the following way. Let p = (p_L ← p_K → p_R)
be a DPO production. An export production is a pair (p, exp) where exp is a
DPO production of the form (G_L, P_L) ← (G_K, P_K) → (G_R, P_R), such that
G_L ← G_K → G_R is a production modifying the state graph of the agent, and
P_L ← P_K → P_R is a production modifying the set of productions available to
the agent, i.e. P_L, P_K and P_R are sets of productions, as in [17].

The WIPPOG representation for these productions is

- WHEN: ’W” 

- PRODUCES: 

- OUTS: 

where Wexpo Wexpr are WIPPOG representations of the two exp compo- 
nents. 

According to the model presented in the previous section, messages in exp are
regarded as resources made available to the receiving agents and which can be
consumed inside them. In particular, since these messages contain coding of rules,
they define a new possibility of transformation of the agent’s state graph or of its
production set. Such a transformation may occur only once, upon consumption
of the message. When a message is executed in the agent which receives it, the
message can produce either a modification of the agent’s state graph, or of the
agent’s repertoire of rules.

A particular case is that in which exp only amounts to ∅ ← ∅ → G_R. If this
rule is executed in a receiving agent with state graph G, the DPO construction
produces as result the coproduct G + G_R, i.e. the disjoint union of the sets of
nodes and edges in G and G_R, respectively. The two graphs can then be merged
by the application of suitable rules. Analogously, a message of type ∅ ← ∅ → P_R
would simply add the set of productions P_R to the production set of the agent.
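
The disjoint union produced in this particular case can be illustrated with a short Python sketch. The graph encoding (a pair of a node set and a dictionary from edge identifiers to endpoint pairs) and the tagging scheme used to keep the two graphs apart are assumptions made for the example.

```python
def disjoint_union(g, h):
    """Coproduct of two graphs given as (nodes, edges).

    `nodes` is a set of identifiers; `edges` maps an edge identifier to a
    (source, target) pair. Identifiers are tagged with 0 or 1 so that equal
    names in the two graphs remain distinct in the result.
    """
    def tag(graph, t):
        nodes, edges = graph
        tagged_nodes = {(t, n) for n in nodes}
        tagged_edges = {(t, e): ((t, s), (t, d)) for e, (s, d) in edges.items()}
        return tagged_nodes, tagged_edges

    nodes_g, edges_g = tag(g, 0)
    nodes_h, edges_h = tag(h, 1)
    return nodes_g | nodes_h, {**edges_g, **edges_h}
```

Merging the two parts afterwards would be the task of further rules of the receiving agent, as stated above.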

When the agent’s state graph g has a match for the rule G_L ← G_K → G_R, then
the rule will be executable. Similarly, when the agent’s repertoire has a match
for P_L ← P_K → P_R and the rule is executed, the set P will be transformed
accordingly. However, the two rules in exp are not added to the set P, since
this would make them permanently available to the agent, and would expose the
agent to external threats, if it has no control on execution of productions from
outside. Hence, we assume that a protocol is defined on communication, such
that messages are first placed by the agent in an "import" interface component.
If the content of messages is safe, it is removed from the import area and brought
by suitable internalisation rules to a "quarantine" component. Productions in
quarantine can be executed at any instant on the state graph or the production
set, but once executed are removed from the agent. Then, while reception of a
message can be modelled as instantaneous, the execution of the rules it contains
occurs asynchronously when the agent’s state (graph and repertoire) allows it.

While the part of the protocol delivering messages to agents is the responsibility 
of the communication infrastructure, the transfer from import to quarantine and 
the execution of productions in quarantine are the responsibility of the agent 
itself. We now enrich the definition of the agent to be ag = (g, P, i, q), where i
and q describe the state of import and quarantine respectively (see Figure 2).
These transfers can now be specified as the transitions:

- GETS: "msg(Zg, Zp)"

- IF: "safe(Zg, Zp)"

- PRODUCES: "quar(Zg)" "quar(Zp)"

to bring the rule to quarantine, assuming that the GETS component looks for
resources of type msg in the agent’s import area, and

- WHEN: "(L - K)g" "quar(Lg ← Kg → Rg)"

- PRODUCES: "Kg" "(R - K)g"

to execute it from the quarantine and remove it. Here (L - K)g, Kg and (R - K)g
are shorthands for the representation of the production described above.
Analogous transformations define the modification of the production set P. The 
productions in the messages could in turn be export productions, for example 
enabling the receiving agent to send back an answer. 
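
The import and quarantine protocol can be summarised by a small Python sketch. It is only an illustration under our own assumptions: the class Agent and its method names are hypothetical, rules are pairs (when, produces) of frozensets of resources, and the safety check is a predicate supplied by the caller.

```python
class Agent:
    """Agent with a state, an import area and a quarantine (assumed encoding)."""

    def __init__(self, state):
        self.state = set(state)
        self.import_area = []   # messages delivered by the infrastructure
        self.quarantine = []    # rules admitted for a single execution

    def receive(self, rule):
        """Reception is instantaneous: the message lands in the import area."""
        self.import_area.append(rule)

    def internalise(self, safe):
        """Move safe messages from import to quarantine; others stay in import."""
        kept = []
        for rule in self.import_area:
            (self.quarantine if safe(rule) else kept).append(rule)
        self.import_area = kept

    def step(self):
        """Execute one enabled quarantined rule, then remove it from the agent."""
        for index, (when, produces) in enumerate(self.quarantine):
            if when <= self.state:
                self.state = (self.state - when) | produces
                del self.quarantine[index]
                return True
        return False
```

A received rule is thus never added to the agent's permanent production set: it is executed at most once, when the state allows it.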




Fig. 2. Schematisation of an agent’s internal structure. Arrows indicate reading (r) or 
writing (w) of data in the different components. 



The proposed communication mechanism leaves the agent permeable to external
communication. However, this does not compromise the agent’s security, if each agent
can use a private alphabet of labels and a common alphabet for communication. 
Internalisation rules can make the imported rules available for use in the agent. 

Let C_i be a set of labels private to agent ag_i and C be a set of labels such that
each agent is equipped with a mapping m_i : C → C_i. Then labels in msg are
elements from C and the agent’s internalisation mechanism performs relabeling
via a graph transformation system with productions like those in Figure 3.



Fig. 3. Example of relabeling rules for internalisation



In this way, an agent may subordinate the application of a rule received from 
outside to its relabeling, and execute it safely. Moreover, internalisation rules 
might be enriched to specify that the rules to be brought into quarantine have 
a given form or are in some relation with the other rules. 
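
The relabeling performed during internalisation can be sketched as follows in Python. The encoding of terms as (label, arguments) pairs, the dictionary standing for the mapping from the common alphabet to the private one, and the choice to reject rules using unknown labels are assumptions of this example.

```python
def relabel(terms, mapping):
    """Rename the label of each (label, args) term through the agent's mapping."""
    return {(mapping[label], args) for label, args in terms}

def internalise_rule(rule, mapping):
    """Relabel a received rule (when, produces) into the private alphabet.

    Returns None when the rule uses a label outside the common alphabet,
    so that such a rule is never applied to the private state.
    """
    when, produces = rule
    try:
        return relabel(when, mapping), relabel(produces, mapping)
    except KeyError:
        return None
```

Because the private labels never appear in messages, a received rule can only touch the agent's state after passing through this mapping.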

The agents might also be endowed with purge rules that discard a message from
the import area, so that it cannot be applied.

The export and internalisation mechanisms allow forms of generative commu- 
nication, in which an agent recognises the possibility of using messages, as well 
as more typical message passing mechanisms, where a message provokes the ex- 
ecution of some activity in the receiver. In any case, the proposed mechanisms 
do not rely on any specific model of communication or require specific forms of 
implementation.

Example 1 referred to a situation where both agents had access to the same
gameboard. The possibility to communicate rules allows us to propose a dis- 
tributed specification of the tic-tac-toe game, where each agent has a copy of 
the gameboard and communicates its moves to the other. 

Basically, each production specifying a move in Example 1 is augmented with an 
export part which replicates the executed move, so that agents can coordinate 
their actions to maintain a consistent view on individual states in cases where 
they cannot share a common support. 

The only exceptions are the moves checking whether an agent has won the current
game, where Agent1 only needs to transmit the following production



- WHEN: "turn(1)"

- PRODUCES: "turn(3)"



to notify the other agent that it has won (and the analogous export for
Agent2), and the final rule that is independently executed in both agents when 
the state depicted in its WHEN component is reached. 


5 Applications

We sketch two applications of the proposed communication mechanism. The 
first consists of the realization of a Consumer-Producer protocol either in a
synchronous or asynchronous way. The second tackles the problem of dynamic
change, faced when a modification in an execution policy must not disrupt activities,
executing under the old policy, which were already started when the change
occurred, while guaranteeing that new activities follow the new policy.



5.1 Producer-Consumer protocol

Consider two agents playing the role of a producer and a consumer, and the
following stack buffer used to store the produced items:






[Figure: the stack buffer with its pointer P and the oval lock node, labelled FREE]



The producer can insert an item if the buffer is not full, while the consumer can 
remove an item if the buffer is not empty. The pointer "P" points to the first free
cell of the stack. The use of the buffer is exclusive: the attribute values FREE,
BusyP and BusyC of the oval node beside the buffer have to be correctly set
and checked to guarantee the exclusive access.

This protocol can be modeled by three agents coordinating their actions: Controller,
Producer and Consumer. The task of the Controller agent is to nondeterministically
assign the use of the buffer to the Producer or Consumer agents.
It has the following two productions:



[Figure: the two productions of the Controller, each shown as a local rule and an exported rule, over the lock values FREE, BusyP and BusyC]



The left side of the box contains the local rule while the right side contains the 
exported rule. The interface component of the rules is not shown.

The Producer has the following three productions to insert the produced items 
into the buffer: the first production inserts an item in any intermediate cell, while 
the second one is used to insert an item in the last free cell. The last production 
changes the lock value from BusyP to FREE when the buffer is already full. 
The Consumer productions are symmetric to those of the Producer.

This solution is quite similar to the one proposed for the Tic-tac-toe game: the
agents maintain a consistent view of the common state (i.e. the buffer and the
node controlling exclusivity) by being synchronized on each single change.

On the other hand, looking at the producer, consumer and buffer as indepen- 
dent devices (agents), it is possible to model the same protocol in a completely 
asynchronous way. We propose directly a WIPPOG solution where: 

- The producer has a single production for producing an item and sending it 
to the buffer: 

(under some arbitrary precondition) 

PRODUCES: product(x) 

OUTS: write(x) 

- The consumer receives the items via messages from the buffer, and uses them 
in some way: 

GETS: read(x) 

IF: safe(x)

PRODUCES: quar(x)

- The buffer deals with the insertion of items carried by the messages received from
the producer, and with the deletion of items sent to the consumer:

WHEN: position(i), content(i,x')
GETS: write(x)
IF: notLast(i), isNull(x')
PROCESSES: j = i + 1
PRODUCES: position(j), content(i,x)

WHEN: position(i), content(i,x')
GETS: write(x)
IF: isLast(i), isNull(x')
PRODUCES: position(i), content(i,x)

WHEN: position(i), content(i,x)
IF: isNotFirst(i), notNull(x)
PROCESSES: j = i - 1, z = nil
PRODUCES: position(j), content(i,z)
OUTS: read(x)

WHEN: position(i), content(i,x)
IF: isFirst(i), notNull(x)
PROCESSES: j = BUFFER.LASTINDEX, z = nil
PRODUCES: position(j), content(i,z)
OUTS: read(x)



5.2 Dynamic change 



We give here a brief sketch of the solution to the problem of dynamic change in
the case where agents follow partial orders for rule application in realising a given
policy. Such an order can be realised by having special nodes which maintain
a toDo list and others which maintain the currently enabled rules. We do not
discuss the enabling mechanism here; see [5] for an implementation in LO. If
we allow rules which change the set of productions while a certain sequence of
applications has been started, we run the risk that an agent can neither progress
in the original policy nor apply the new policy, for instance because some
resources have already been consumed by the old policy. To avoid this problem,
we assume that sets of rules are associated with theory identifiers and that
policies are enforced by an execAccord resource. The rules of transformation sent
by an agent which requires a policy change only add to the rule base (without
deleting the old rules). The message also specifies that a special node labelled
chgdThry, containing the specification of the part of plan to modify and the
name of the new theory, must be added to the graph and that the value of the
attribute for a node labelled susp must be switched from false to true (regular
processing is guarded by having susp(false)). For simplicity, we use labels as
identifiers and variables and constants to indicate the values of attributes. The
agent can thus start inspecting the rule base to detect which rules are to be
disabled in the current plan and which rules have to be enabled in substitution.
This is modelled by assuming the presence of a procedure, indicated by lookDiff,
which compares the rules associated with the new and old theories. A predicate
present assesses whether the toDo list contains the elements to be eliminated.
Once the agent has completed the replacement process, it reinstalls the node
susp(false) and eliminates temporary nodes.

The process is described by the following WIPPOG specification. An implementation
in LO, in a case where all plans are centralised in a single manager, is
described in [5].

1. - WHEN: "susp(true)" "chgdThry(PlanPt, NTh)"
"rulebase(P)" "execAccord(PlanPt, OTh)"
- PROCESSES: "[ToErs, ToIns] = lookDiff(P, PlanPt, NTh, OTh)"
- PRODUCES: "correctPlan(ToErs, ToIns)" "rulebase(P)"
"execAccord(PlanPt, NTh)"

2. - WHEN: "correctPlan(ToErs, ToIns)" "toDo(Rls)"
- IF: "present(ToErs, Rls)"
- PRODUCES: "removeFromPlan(ToErs)" "toDo(Rls)"
"addToPlan(ToIns)"

3. - WHEN: "correctPlan(ToErs, ToIns)" "toDo(Rls)"
- IF: "not(present(ToErs, Rls))"
- PRODUCES: "toDo(Rls)" "susp(false)"

4. - WHEN: "removeFromPlan(ToErs)" "toDo(Rls)"
- PROCESSES: "LeftRules = erase(Rls, ToErs)"
- PRODUCES: "toDo(LeftRules)" "removed(true)"

5. - WHEN: "addToPlan(ToIns)" "toDo(Rls)"
- PROCESSES: "NewRules = insert(Rls, ToIns)"
- PRODUCES: "toDo(NewRules)"

6. - WHEN: "added(true)" "removed(true)"
- PRODUCES: "susp(false)"
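
The overall effect of these transitions on the toDo list can be approximated by a short Python sketch. The sketch rests on several assumptions that are ours and not the paper's: the rule base is a dictionary from theory identifiers to lists of rule names, the plan part argument of lookDiff is omitted, lookDiff is read as a plain difference between the two theories, and present is read as "all rules to be erased are still pending".

```python
def look_diff(rulebase, old_theory, new_theory):
    """Hypothetical lookDiff: rules to erase and rules to insert."""
    old_rules = rulebase[old_theory]
    new_rules = rulebase[new_theory]
    to_erase = [r for r in old_rules if r not in new_rules]
    to_insert = [r for r in new_rules if r not in old_rules]
    return to_erase, to_insert

def present(to_erase, todo):
    """Assumed reading of `present`: every rule to erase is still in toDo."""
    return all(r in todo for r in to_erase)

def change_policy(todo, rulebase, old_theory, new_theory):
    """Return the toDo list after the policy change has been processed."""
    to_erase, to_insert = look_diff(rulebase, old_theory, new_theory)  # transition 1
    if not present(to_erase, todo):
        return list(todo)            # transition 3: resume with the old plan
    left_rules = [r for r in todo if r not in to_erase]   # transition 4: erase
    return left_rules + to_insert                         # transition 5: insert
```

In the second branch of the usage below, one of the old rules has already been applied, so the plan started under the old policy is left to complete unchanged.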



6 Conclusions 

The paper has presented an intermediate language, called WIPPOG, for specifying
transformation systems and has shown how it can be used to specify graph
transformation systems based on DPO. This allows the introduction of a notion 
of communication of productions among agents specified via GTS. 

WIPPOG has been primarily developed to allow a uniform representation of 
different visual formalisms [6]. It is based on a model which sees transformations
as production and consumption of resources, possibly guarded by special events
or conditions. The proposed model allows a uniform management of aspects of
metalevel and of mobility. Indeed, an agent A can represent a meta-agent for another
agent B, by sending B messages which make B’s repertoire of productions
change. Reflective actions can be performed by having an agent send messages
to itself to force transformation of its own productions. On the other hand, a
message of type (∅, ∅) ← (∅, ∅) → (G_R, P_R) can be seen as the transmission of
an agent with state G_R and production set P_R to some remote agent in which
it can start to operate. By a suitable relabeling of the elements in G_R and P_R, the
receiving agent can isolate the execution of this new agent in a safe environment.

Finally, the use of an intermediate representation makes it possible to transmit
code specifying transformations over a network. Such code can be either translated
to some executable code, or be directly executed if the receiving platform
has an interpreter for WIPPOG.



Abstract. Specifying and programming with the help of agent systems
is gaining more and more interest, especially in the field of distributed
and reactive systems. In this paper, we propose a formal model of agent
systems based on graph transformation. It is quite natural to visualize
systems, especially system states, by means of graphs. Hence, it is also
appropriate to specify those systems using graphs. Accordingly, changes
from one state into another can be modeled by graph transformation.



1 Introduction

Agent systems are an upcoming framework in software engineering. They origi- 
nally stem from the research field of artificial intelligence. Marvin Minsky stated 
in his innovative book “The society of mind” (cf. [Min85]) that the human mind is 
built up of entities (neurons), each capable of a limited number of functionalities. 
He assumes that from their interaction and cooperation intelligence emerges. 
When first introducing agent systems, researchers tried to imitate and utilize 
this effect. Later on, it appeared that agent systems have many other and even
more significant benefits: they are flexible and have semantically well-defined
and therefore efficient communication. Additionally, they seem to be a well-suited
paradigm for designing distributed, open, and reactive systems. This
becomes increasingly important for software products acting in global
networks, like the internet. As networks, i.e., systems, are suitably represented
by means of graphical representations, we will go on and specify agents and agent 
systems by graphs and the actions they perform by modifying those graphs sys- 
tematically by means of graph transformation. 

The paper is organized as follows: Section 2 introduces concepts of agent
systems and Section 3 concepts of graph transformation, presenting the
formalisms needed in the following. After that, in Section 4, we give a first
notion of what an agent system is, which is defined formally later on.
Graphs have the benefit that their semantics is the graph itself. Hence, a
semantics of a graph-based agent system is intuitive to define. This is done
in Section 5. Afterwards we examine a larger example that demonstrates the
usefulness of our approach as well as of agent systems in general. Finally, we
draw some conclusions and identify work to be done in the future.

M. Nagl, A. Schürr, and M. Münch (Eds.): AGTIVE’99, LNCS 1779, pp. 79-86, 2000.
2 Agent Systems



Without getting too involved with the theory of agent systems, we can state that
an agent system is a system consisting of agents, i.e., computational entities
that interact and communicate in a certain environment. For a far more elaborate
introduction to this theory see [Bra97], [WJ94]. Agents allow modular descriptions
of very complex, dynamic, and distributed systems. They are closely related to
objects but concentrate more on communication than on typing. A main feature of
agent systems is the idea of delegation: an agent is delegated to fulfil a task,
which it pursues subsequently.

In this context, several attributes that describe its kind of behavior can be
assigned to the agent:

autonomous   once an agent gets its tasks, it is not controlled by other entities,
reactive     agents react to changes in their environment,
proactive    an agent is proactive if it is able to initiate interactions with
             other agents on its own,
cooperative  agents work together to reach a common aim,
rational     agents behave rationally, i.e., they perform only actions they
             consider to be optimal in the respective situation,
intelligent  a fuzzy attribute meaning that an agent is capable of doing smart
             things like learning, reasoning, resolving conflicts, etc.



Based on reactivity, which is a main characteristic, each of these agents is
capable of sensing the environment and reacting appropriately, always pursuing
its specific goal. The changes made by agents can be recognized by other agents.
This observable external behavior of the agents corresponds to a functional
abstraction within the whole system because the agents' internal functionalities
can be far more complex. Hence, with the help of agent systems very complex
systems can be modeled modularly. If the agents' goals are chosen accordingly,
problems can be solved concurrently. In this case we speak of cooperative agents.
In some existing applications, especially in the field of electronic commerce,
agents behave competitively. Electronic markets, supply chains, and automated
negotiations are only a few examples of scenarios in which agent systems gain
increasing importance, because those systems are characterized by high dynamics
within their component structure and by complex interaction between the
participants, and they are naturally distributed. The characteristics of these
systems are investigated in [Bra97] and [RM97]. When designing or investigating
those systems, graphical notations are often used to facilitate the understanding
of their structure and, partially, of their behavior. Therefore, it seems very
suitable to use graphs as a means to specify the system's structure and graph
transformation to specify the system's evolution.

Up to now, many attempts at specifying agent systems have been made. They
often use specification languages like CSP, Z, or modal logics in order to specify
the essential agent components. Yet, there are not many attempts towards formal
frameworks that cover all aspects of agenthood and make it possible to give a
simple notion of what an agent system really is, i.e., what its syntax and its
semantics are. We are going to use mainly one formalism to build up an agent
system.

An agent system is investigated within an environment, which determines the
basis of the problems the agents should solve and the part of the world the agents
can act on and react within. Like the system itself, the problem is modeled
as a graph. Acting on a graph refers to perceiving the graph and changing
parts of it. In our rule- and graph-based framework, agents are nothing but
module-like entities with local rules which they “try” to apply to a common
graph, called the environment. Because the rules are local, agents have some
kind of privacy, corresponding to the encapsulation and information hiding known
from other software engineering concepts. The left-hand side of a rule “senses”
the environment to assess whether the rule is applicable, i.e., whether the
precondition is fulfilled, and whether its application is allowed w.r.t. the
agent's goal. This check is done using a special kind of control mechanism. Then
the agent reacts and manipulates its environment according to the rule's
right-hand side. The above-mentioned control determines the agent's goals, and
the rules specify its capabilities.

3 Graph Transformation 

Now, we are going to introduce a simple variant of the graph transformation
approach known from the literature as the double pushout approach (cf.
[CEH+97]). In this paper we only use undirected graphs with labeled edges. This
restriction does not mean a loss of generality. It is only used to make the
examples easier and more comprehensible and could be enhanced or replaced
anytime.

A graph $G$ is a construct $G = (V, E, inc, l_E)$, where $V$ is a finite set of
nodes, $E$ is a finite set of edges, $inc$ is the incidence mapping which assigns
two nodes to each edge, and $l_E : E \to \Sigma$ is a labeling mapping over an
alphabet $\Sigma$. If it is not clear from the context which graph is meant, we
write $G = (V_G, E_G, inc_G, l_{E_G})$ to distinguish the components.
$\mathcal{G}$ is the set of all graphs. A graph $G'$ is called a subgraph of
another graph $G$, denoted as $G' \subseteq G$, if it consists of a subset of
nodes $V_{G'} \subseteq V_G$ and a subset of edges $E_{G'} \subseteq E_G$, and the
incidences and labelings of $G$ and $G'$ coincide on $V_{G'}$ and $E_{G'}$,
respectively. $G'$ is a proper subgraph of $G$ if additionally
$V_{G'} \neq V_G$ or $E_{G'} \neq E_G$ is true.

A subgraph $G' \subseteq G$ is an occurrence of a graph $G''$ in $G$ if $G'$ is
obtained from $G''$ by renaming nodes and edges (where different nodes in $G''$
may become identical in $G'$). A rule $r$ is a triple $r = (L, R, \mathcal{M})$
where the left-hand side $L$ and the right-hand side $R$ are graphs as defined
above with $V_L \subseteq V_R$. The component $\mathcal{M}$ is a set of graphs
such that for each $M \in \mathcal{M}$, $L$ is a proper subgraph of $M$. It
serves as negative context condition in the sense of [HHT96] and is introduced
in order to express a situation in which a graph rule must not be applied. It is
used in the following way: if $M \in \mathcal{M}$ and $L'$ is an occurrence of
$L$ in $G$, then an occurrence $M'$ of $M$ in $G$ with $L' \subseteq M'$ is
forbidden.

A rule with negative context condition is applied to a graph $G$ in four steps:
First, find an occurrence $L'$ of $L$ in $G$. Second, check the negative context
condition. Third, remove the edges of $L'$. And finally, add the edges of $R$ and
the extra nodes of $R$ which are not in $L$, where the incident nodes are renamed
according to the renaming between $L$ and $L'$.

Because we assume $V_L \subseteq V_R$, we do not have the problem of dangling
edges. Since rules are not allowed to erase nodes, the incidences of preserved
edges are preserved. We say that $G$ directly derives the resulting graph $G'$ by
applying the rule $r$ and denote this by $G \Rightarrow_r G'$. Note that the
derivation becomes nondeterministic if there are two or more occurrences of $L$
in $G$ or if different rules can be applied at the same time. Therefore, we often
need to control the derivation process. If we are able to do so we can speak of
programming by graph transformation as it is done in [Bun79] and [Sch00], for
instance.
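
The four application steps described in this section can be illustrated by a small Python sketch. It is not taken from the paper: the names edge, rename and apply_rule are hypothetical, nodes are assumed to be integers, occurrences are searched injectively (the definition above also allows different nodes to be identified), and each negative context is assumed to add only edges over the nodes of L.

```python
from itertools import permutations


def edge(u, v, label):
    """Undirected labeled edge; the frozenset makes u-v equal to v-u."""
    return (frozenset((u, v)), label)


def rename(pattern_edges, m):
    """Rename the end nodes of pattern edges according to the mapping m."""
    return {(frozenset(m[n] for n in ends), label) for ends, label in pattern_edges}


def apply_rule(nodes, edges, rule):
    """Apply a rule once; return the new (nodes, edges) or None if inapplicable."""
    l_nodes, l_edges, r_nodes, r_edges, forbidden = rule
    for image in permutations(sorted(nodes), len(l_nodes)):
        m = dict(zip(l_nodes, image))
        # Step 1: find an occurrence of L in the graph.
        occurrence = rename(l_edges, m)
        if not occurrence <= edges:
            continue
        # Step 2: check the negative context condition (assumed: extra edges only).
        if any(rename(extra, m) <= edges for extra in forbidden):
            continue
        # Step 3: remove the edges of the occurrence.
        remaining = edges - occurrence
        # Step 4: add the extra nodes of R under fresh names, then the edges of R.
        fresh = max(nodes, default=0) + 1
        for n in r_nodes:
            if n not in m:
                m[n] = fresh
                fresh += 1
        return nodes | set(m.values()), remaining | rename(r_edges, m)
    return None


# Hypothetical rule: relabel an "a" edge to "b" unless a parallel "b" edge exists.
relabel = (("x", "y"), {edge("x", "y", "a")}, ("x", "y"), {edge("x", "y", "b")},
           [{edge("x", "y", "b")}])
nodes, edges = {1, 2, 3}, {edge(1, 2, "a"), edge(2, 3, "a")}
nodes, edges = apply_rule(nodes, edges, relabel)
```

The loop returns at the first admissible occurrence, so the nondeterminism mentioned above is resolved here simply by the enumeration order; a real control mechanism would have to make this choice explicit.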



4 Graph Transformation Based Agent Systems 



As we mentioned before, an agent can be viewed as a computational entity having
its own goals and capabilities. In order to reach its goals it must be able to
perform actions in a sensible or rational way, i.e., it has some kind of plan it
tries to fulfil. Therefore, we define an agent as a pair agent = (P, C) where P is
a set of rules and C is a capability condition determining the order in which
rules have to be applied. What is C then made of? Some work has been done in this
context, compare especially [Kus98]. The simplest choice is regular expressions
over rule names, which make it possible to require certain sequential
compositions, alternatives, and iterations. The capability condition is a kind of
interface to a more elaborate control, realizing the agents' attributes like
proactive, cooperative, intelligent, and so on.

In our framework, an agent has a relational semantics
$SEM(agent) \subseteq \mathcal{G} \times \mathcal{G}$. It takes an environment
graph and transforms it into another, i.e., it yields a binary relation on
graphs. In this respect, the semantics is comparable to the one of transformation
units or modules known from [KK99] and [HHKK98]. But here we demand a different
kind of control condition: a rule application is not only allowed by the
capability condition, it is prescribed. Using the definition of agents, we define
an agent system as a set of agents associated with an initial environment. As a
first simple semantics of an agent system we can take the arbitrary sequential
composition of the agents' semantics. Hence, it again yields a binary relation on
graphs. This semantics does not take concurrency into account. However, it is
sufficient for the following example.
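
As an illustration of a capability condition given by a regular expression over rule names, the following Python sketch checks whether a sequence of applied rules is admitted by such an expression. The function satisfies, the rule names init, step and done, and the encoding of a rule sequence as a string with ";" after each name are assumptions made for this example only; the check validates a finished sequence and does not model the fact that the condition prescribes the next rule application.

```python
import re


def satisfies(applied_rules, capability):
    """Check a sequence of rule names against a regular expression over rule names."""
    # Each rule name is terminated by ";" so that names cannot run together.
    word = "".join(name + ";" for name in applied_rules)
    return re.fullmatch(capability, word) is not None


# Hypothetical agent (P, C): only the rule names are given, the rules are omitted.
agent = ({"init", "step", "done"}, r"init;(step;)*done;")

ok = satisfies(["init", "step", "step", "done"], agent[1])
rejected = satisfies(["step", "done"], agent[1])
```

Sequential composition, alternatives and iteration map directly to concatenation, "|" and "*" in the expression.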

Earlier in this paper we pointed out that communication is essential in agent 
systems. How does an agent talk to another? Here agents communicate by chang- 
ing the environment graph. This is a well studied concept known as blackboard 
communication, an asynchronous way to communicate. Hence, the environment 
serves as blackboard and the agents, always trying to apply their next rule, sense 
the blackboard and react according to the information and changes they find. 
5 Semantics of Multiagent Systems

Semantics of reactive systems are not easy to treat, but they are frequently
investigated with the help of several models: transition systems [Plo81], Petri
nets [Pet62], Hoare traces [Hoa85], and event structures [Win87] are the most
important ones. They can be classified by several features. It is important to
know whether they are able to model true concurrency or whether they use
interleaving, i.e., they reduce parallel execution to a sequential execution in a
nondeterministically chosen order.

But how can we give a concurrent semantics? The idea all these models have
in common is that they are based on atomic units of change. One of the most
important questions is what these atomic entities are, i.e., what the events or
processes are that can be identified, and how we can define a notion of
independent events which can happen concurrently. First, we have to distinguish
between an action and an event. An action can occur many times in a system's run,
whereas an event is a unique instance of such an action. Because we use a
rule-based framework, an atomic unit of change is of course the application of a
rule. Here we do not consider problems like the overlapping of occurrences of the
left-hand sides of the rules. This problem can be overcome by using graph
abstraction: agents that have such critical pairs of left-hand sides act on
subgraphs exclusively. Another solution for this kind of problem could be
communication: agents that recognize conflicts communicate to resolve them.

This consideration indicates that the work on independence, parallelism, and
concurrency within the framework of graph transformation (see, e.g., [EKMR99])
can be advantageously employed in the analysis of agent systems, which will be
done in future research. Here, the actor grammars defined in [Jan99] are of
particular interest, because actor systems can be considered as a very early but
limited form of agent systems. The π-calculus is studied in [MPR99] and serves to
model concurrent systems with communication as well. Parallel and distributed
graph transformation is investigated in [Tae96].

6 An Example

In this section we investigate the Floyd-Warshall algorithm (see [KK99] for a
graph transformational description) for the computation of shortest paths and
give a cooperative concurrent version of it. There are four different agents
having different capabilities, i.e., different rules and different goals. Two of
these agents are responsible for solving the problem. They cooperate to compute
the results in a parallel fashion. The others insert nodes and edges and hence
modify the problem.

Concerning this kind of computation, not only the parallelism and the
corresponding possible speed-up are of interest. We also demonstrate another
benefit: because the agents sense their environment, which represents a reactive
system, they are able to notice changes. Concerning path lengths, as in road
maps, for instance, changes are local but could have a global effect. In this
example, we want to allow further edges and nodes to be added by agents. Because
the system does not terminate, agents do not have to compute all shortest paths
by starting from the very beginning but just reprocess the affected parts.

The environment consists of a map, an undirected graph the edges of which
are labeled with natural numbers. The four agents we have to define have the
following goals:

- The agent sum takes two subsequent edges and adds an edge whose length is the
  sum of the lengths of both edges, if it does not exist yet.

- The agent min compares two parallel edges and selects the shorter one.

- The third agent add-edge adds new arbitrary edges between nodes.

- The fourth agent add-node inserts new nodes.



In Figure 1 those four agents are depicted in a pseudo code notation.

[Fig. 1. The four agents sum, min, add-edge, and add-node in pseudo code notation]




In the capability conditions the exclamation mark stands for “as long as
possible”. Due to the system being reactive, agents try to apply their rules to
reach their goals as long as possible.

Agent sum and agent min are cooperative agents. Both have only very limited
capabilities. But they work together and calculate the shortest paths although
each of them, on its own, could not achieve that.

This agent system does not provide the optimal shortest paths between all
nodes at every point in time, because it may be in a state of reprocessing as a
result of an inserted edge or node. But apart from that, it provides acceptable
approximations at any time.
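
The cooperation of sum and min, and the reprocessing after an inserted edge, can be sketched in Python as follows. This is an illustration under stated assumptions, not the pseudo code of Figure 1: all function names are hypothetical, the agents are interleaved sequentially, and sum is guarded so that it only adds an edge that is shorter than every existing parallel edge. That guard is an assumption made to keep the sketch terminating; it is stricter than the "if it does not exist yet" condition given above.

```python
from itertools import permutations


def e(u, v, x):
    """Undirected edge between u and v labeled with the length x."""
    return (frozenset((u, v)), x)


def lengths(edges, u, v):
    """Labels of all parallel edges between u and v."""
    return [x for ends, x in edges if ends == frozenset((u, v))]


def sum_step(nodes, edges):
    """Agent sum: join two subsequent edges u-v and v-w into a new edge u-w."""
    for u, v, w in permutations(sorted(nodes), 3):
        for x in lengths(edges, u, v):
            for y in lengths(edges, v, w):
                # Assumed guard: add only if no parallel edge is at least as short.
                if all(z > x + y for z in lengths(edges, u, w)):
                    return edges | {e(u, w, x + y)}
    return None


def min_step(edges):
    """Agent min: of two parallel edges, keep only the shorter one."""
    for ends, x in edges:
        for other, y in edges:
            if ends == other and x < y:
                return edges - {(other, y)}
    return None


def run(nodes, edges):
    """Interleave both agents until neither rule is applicable any more."""
    while True:
        changed = min_step(edges)
        if changed is None:
            changed = sum_step(nodes, edges)
        if changed is None:
            return edges
        edges = changed


nodes = {1, 2, 3}
paths = run(nodes, {e(1, 2, 4), e(2, 3, 7)})
# Agent add-edge inserts a shortcut 1-3; the run continues from the current state.
updated = run(nodes, paths | {e(1, 3, 1)})
```

In this small map the first run adds the path 1-3 of length 11; after the shortcut of length 1 is inserted, the second run replaces the edge 2-3 of length 7 by one of length 5 without recomputing the remaining entries.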
7 Conclusion and Future Work

Although this graph- and rule-based approach is not yet elaborated in detail, it
offers a lot of advantages, which have to be investigated thoroughly in the
future:

- A main advantage is that it is based on a well understood and investigated
  framework, namely graph transformation.

- Different aspects of agenthood can be modeled within a single framework, e.g.,
  agents, their components, communication, cooperation, several attributes, and
  the environment. But as the capability condition is a kind of interface, even
  hybrid approaches are possible.

- Additionally, it supports visual programming of agents; it is therefore
  intuitive, not least because relations between entities are made visible.

- In contrast to other approaches, it has a well-defined interleaving of reactive
  and proactive behavior, as demanded in [Cas96]. This is due to the rule-based
  character of the framework.

- Because of the rule-based executions, agents can log parts of the system run,
  i.e., sequences of rule applications. Based on these data, different strategies
  from artificial intelligence can be deployed to improve their capabilities.

- The approach supports various kinds of dynamics, concerning the system's
  structure and its behavior at runtime.

Of course, agent systems have many more potential advantages independent of 
a graph transformational modeling. Communication can help to resolve conflicts 
and enables cooperation. Furthermore, as for other concurrent systems, the 
use of agent systems can yield a speed up because of parallel execution and 
redundant agents. 
Abstract. In this paper we present an approach that uses a formal
specification formalism, namely graph grammars, to describe simulation
models. The construction of the models is based on the concept of parallel
composition of graph grammars, a kind of composition that is compatible
with a true concurrency semantics. We provide guidelines for describing
and smoothly integrating the different aspects of a simulation model,
namely the behaviour of the simulated system, the desired animation of
the simulation and the statistics of the simulation.



1 Introduction 

The main aims of discrete simulation are to validate a system and to assess its
performance aspects. Although it is possible to perform tests on real systems,
this poses practical problems, especially when the operation of the system is
dangerous or when the system is still being designed. Discrete simulation makes
it possible to build controlled environments where different strategies may be
tested while searching for the best solution.

The first step in simulating a system is to construct a model of the system to
be simulated, which shall be as close as possible to the reality that is being
modeled. A simulation model is composed of many parts. The main one describes
the behaviour of the system; others are concerned with collecting data for
statistics and visualizing the simulation of the system. There are many
environments for simulation. The simpler ones consist of class or routine
libraries that must be used in conjunction with conventional programming
languages such as C, C++, Pascal and others. Simulation languages such as GPSS,
SIMULA or SimScript are high level tools that encompass the basic simulation
concepts needed to build simulation models. Concepts like control of simulated
time and statistics gathering have corresponding built-in constructions in these
languages. Finally, there are interactive development environments, both
commercial ones such as MicroSaint, Arena, Taylor II and ModSim III, and academic
ones such as VMSS, SMOOCHES and SIMOO [CPW97]. In most of these environments, the
behaviour of the model is described by programs written in various programming
languages. Describing the behaviour of the components of a simulation system
by programs has some serious drawbacks:

- Although the model is a specification of the system to be constructed, the
  description is too low level to be used as a specification for the further
  development of the system. Moreover, the programs describing the models
  typically include statistics and animation procedures, and it is hard to
  extract the behaviour model from such a program (statistics and animation are
  only interesting for the simulation, but not for the development of the
  system);

- It is hard to prove behavioral properties of the system;

- The reuse of simulation components in other models is more difficult;

- If the platform or implementation language changes, all existing models may
  be lost, or have to be adapted or re-implemented in a new language.

In this paper we advocate the use of a formal specification method, namely 
graph grammars, to describe simulation models. This way, the model would not 
only serve as a basis for the analysis of performance issues, but also for the 
formal development and verification of behavioral properties of the system. The 
concepts presented here provide a basis for the re-implementation of the SIMOO 
environment. 

In the area of simulation of concurrent systems, graph grammars have suc- 
cessfully been used to simulate biological systems, like tissue and plant growth. 
The grammars used for this are based on Lindenmayer systems 
(L-systems) [Lin87,PL96], which describe discrete models of evolution that in- 
volve a great amount of parallelism. In fact, many kinds of L-systems have been 
modeled with graph grammars (see [Tae96] for an overview), taking advantage 
of the use of graphs instead of strings. Here, we will use graph grammars to
specify simulation models of general discrete distributed and concurrent
systems. This choice seems to be adequate because a graph naturally describes the
distribution of a system and allows for the parallel application of rules in many 
of its subgraphs. Moreover, the parallel composition operator defined for graph 
grammars in [Rib96b,Rib99] can be used to allow a componentwise construction 
of simulation models in which the behavior of the whole system can be derived 
from the behaviors of its parts (that is, we assure that no side-effects occur when 
putting the components together). 

The formal basis of this work is the algebraic approach to graph grammars
[Ehr79,Low93,EHKT97]. This approach has been especially well investigated
in the area of concurrency and provides a smooth integration of graphs
and abstract data types (which is very desirable for practical applications).

Section 2 presents the components of a simulation model and the SIMOO 
environment. Sect. 3 is an (informal) overview of the basic concepts of graph 
grammars and introduces the concept of composition that will be used in the 
construction of simulation models in Sect. 4. In Sect. 5 we conclude our paper 
and give the directions of our future research in this area. A short version of this 
paper can be found in [CK98]. 
2 Simulation Systems and the SIMOO Environment

A discrete simulation model typically consists of simulation entities, discrete
events and a stochastic behaviour associated with the transitions between states.
In general, it is possible to describe a system using discrete as well as continuous 
structures. Models that are implemented on digital computers are inherently 
discrete. However, a variable whose domain is only limited by the capacities of 
the computational environment can be considered as continuous [Pri74]. 

According to Shannon [Sha75], discrete simulation models are collections of
entities that interact with each other. Their descriptions have a static and a
dynamic structure. The static structure defines the entities' attributes and is
often represented as a collection of data objects and their possible values. The
dynamic structure defines how the states change over time. There are different
aspects to consider in the dynamic definition [CPW96]. The first aspect is the
way we represent the events that define the system’s state changes. The most
widely used approaches to describe these events are the event approach, the
activity approach and the process interaction approach. Pidd [Pid94] states
that all three approaches have in common the fact that they produce programs
with a three-level hierarchical structure:

Level 1: Executive (control program), responsible for sequencing the operations
which occur as the simulation proceeds. In most simulation environments
this level is hidden from the programmer.

Level 2: Operations, the set of statements describing the operations performed
within the model. It constitutes the simulation program “proper”.

Level 3: Detailed routines, the set of resources used by the second level to
model the system of interest. It consists of resources for taking random
samples, for producing reports, for collecting statistics, etc.

The second aspect refers to the resources needed for communication between
entities. Simulation entities communicate with each other for exchanging
information and tasks and for scheduling events. The best-known communication
techniques are the use of ports and messages.

Models may also be described either from the point-of-view of client or server 
entities. However, the boundary between both approaches is not well defined and 
there are several problems that cannot be modeled using only one point of view. 
Because of this, it is usual to adopt a hybrid approach. This is the approach 
taken in the SIMOO environment. 

SIMOO [CPW97] is an integrated environment for modeling and simulation 
of discrete systems. It is composed of two main modules: the class library and the
model editor. The editor supports the description of static and dynamic aspects 
of the model. Structural aspects are described graphically, while the dynamic 
behavior of each entity is described in C++ using library resources. From the 
model description the editor automatically generates the necessary executable 
code. 
The executive level of SIMOO implements the event-oriented approach and
supports the use of messages for the communication between entities. Thus, events
and communication by messages are the basis of all paradigms SIMOO supports.
Other approaches, such as the process-oriented approach and communication by
ports, can also be modeled [CPW97].

3 Graph Grammars and Their Composition 

Graphs are a very natural means to explain complex situations on an intuitive
level. Graph rules may be used complementarily to capture the dynamical aspects
of systems. Graph grammars generalize Chomsky grammars from strings to graphs.
Rather than using plain graphs merely consisting of vertices and edges, we use
typed and attributed graphs, making specifications more natural, compact, and
easy to survey. Attributes are specified using abstract data types (the concept
of attributed graphs was defined in [LKW93]). Moreover, we use another typing
scheme to allow “graphical types”, called typed graphs. Thus, a graph grammar
consists of a type graph (an attributed graph describing which kinds of vertices
and edges are allowed in the grammar), an initial graph (describing the initial
state of a system) and a set of (graph) rules (describing the possible state
changes in the system).

The following interpretation of a rule r : L → R¹ provides the basis for this
specification approach: items in L which do not have an image (according to r)
in R are deleted; items in L which are mapped to R are preserved; items in R
which are not in the image of r are created. The behavior of a system described
by a graph grammar is based on the application of rules to actual graphs. The
application of a rule to an actual graph, called a derivation step, is possible
if there is an occurrence of the left-hand side of this rule in the actual graph.
The result of the application of a rule r : L → R to a graph IN can be obtained
in three steps (according to the Single-Pushout approach to graph grammars):
1. Add to IN everything that is created by the rule; 2. Delete from the result of
1 everything that shall be deleted by the rule; 3. Delete dangling edges.
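
The three steps can be illustrated by a minimal Python sketch. It assumes that an occurrence of the left-hand side has already been found, so that the items to be deleted and created are given directly in terms of the actual graph; the function name spo_step, the representation of edges as a dictionary from edge names to (source, target) pairs, and the example items are hypothetical.

```python
def spo_step(nodes, edges, delete_nodes, delete_edges, new_nodes, new_edges):
    """One derivation step for an already matched rule.

    nodes: set of node names; edges: dict edge name -> (source, target).
    delete_*: items of the actual graph deleted by the rule.
    new_*: items created by the rule.
    """
    # 1. Add everything that is created by the rule.
    nodes = nodes | new_nodes
    edges = {**edges, **new_edges}
    # 2. Delete everything that shall be deleted by the rule.
    nodes = nodes - delete_nodes
    edges = {name: st for name, st in edges.items() if name not in delete_edges}
    # 3. Delete dangling edges, i.e. edges whose source or target is gone.
    edges = {name: (s, t) for name, (s, t) in edges.items()
             if s in nodes and t in nodes}
    return nodes, edges


# Deleting node b leaves e1 and e2 dangling, so step 3 removes them as well.
out_nodes, out_edges = spo_step(
    {"a", "b", "c"}, {"e1": ("a", "b"), "e2": ("b", "c")},
    delete_nodes={"b"}, delete_edges=set(),
    new_nodes={"d"}, new_edges={"e3": ("a", "d")},
)
```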

By allowing parallel applications of rules we can define concurrency semantics 
for graph grammars. In our approach, we allow parallel applications of rules that 
do not delete the same items. Thus, two processes having read-access to the same 
resources may be executed in parallel. Note that graph grammars allow for much 
more parallelism in the system than other specification formalisms (like classical 
Petri nets) that do not allow the preservation of items [KR96]. 
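
Read literally, the criterion above says that two rule applications may happen in parallel as long as they do not delete the same items. A minimal Python sketch of this check, with the hypothetical function name parallel_applicable and hypothetical item names, could look as follows; it ignores any further conditions the formal definition may impose.

```python
def parallel_applicable(deleted_by_first, deleted_by_second):
    """True if the two rule applications delete no common item."""
    return deleted_by_first.isdisjoint(deleted_by_second)


# Two processes that only read the shared item "resource" and consume their
# own request items do not conflict; two that consume the same token do.
readers = parallel_applicable({"request1"}, {"request2"})
writers = parallel_applicable({"token"}, {"token"})
```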

The idea of the composition introduced in [Rib96b,Rib99] is based on a top- 
down development of the system: first an abstract description of the components 
and their interconnections is fixed, then each component is specialized separately 
and at the end they are put together. An important aspect of this kind of com- 
position is that it is compositional with respect to a true concurrency semantics 
of graph grammars (the unfolding semantics). 

¹ Formally, a rule is a graph homomorphism between L and R.



Compositional Construction of Simulation Models Using Graph Grammars 






[Diagram: the Abstract View = GG0 is specialized (s1, s2) into Component 1 = GG1
and Component 2 = GG2; their composition (pb) yields System = GG3]

To build a system using this kind of composition we shall first define a grammar
GG0 that represents a description of the whole system, called the abstract view.
Then this grammar shall be specialized in different ways, giving rise to grammars
GG1 and GG2 (or more, if there are more components). These specializations may be
done in two ways: i) by adding new items to the rules, types, or initial graph of
the abstract view, ii) by adding new rules. The composition of GG1 and GG2 with
respect to GG0 is constructed as a union of these three grammars: the type and
initial graph of the composition are the unions of the corresponding type and
initial graphs, and the rules are the rules obtained as the union of
corresponding rules in GG0, GG1 and GG2 (amalgamated rules), the rules that are
in GG1 and GG2 and not in GG0, and the parallel rules obtained from the latter
ones. The amalgamated rules put together the different specializations made in
GG1 and GG2 of the same rule of GG0.
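
At the level of plain sets, the union construction can be sketched in Python as follows. This is a strong simplification made for illustration only: graphs are represented as sets of item names, a rule as a pair of such sets, corresponding rules are assumed to share a name, new rules of different components are assumed to have distinct names, and the parallel rules are left out. The function compose and all example names are hypothetical; the formal construction is the one given in [Rib96b,Rib99].

```python
def compose(gg0, gg1, gg2):
    """Set-level union of gg1 and gg2 with respect to the abstract view gg0."""
    rules = {}
    for name in gg0["rules"]:
        # Amalgamated rule: union of all versions of the same rule of gg0.
        parts = [g["rules"][name] for g in (gg0, gg1, gg2) if name in g["rules"]]
        lhs = set().union(*(left for left, _ in parts))
        rhs = set().union(*(right for _, right in parts))
        rules[name] = (lhs, rhs)
    for g in (gg1, gg2):
        # Rules that exist only in a component are taken over unchanged.
        for name, rule in g["rules"].items():
            if name not in gg0["rules"]:
                rules[name] = rule
    return {
        "types": gg0["types"] | gg1["types"] | gg2["types"],
        "init": gg0["init"] | gg1["init"] | gg2["init"],
        "rules": rules,
    }


gg0 = {"types": {"Entity", "Msg"}, "init": {"e1"},
       "rules": {"send": ({"e1"}, {"e1", "m"})}}
gg1 = {"types": {"Entity", "Msg", "Queue"}, "init": {"e1", "q"},
       "rules": {"send": ({"e1", "q"}, {"e1", "q", "m"}),
                 "enqueue": ({"q", "m"}, {"q"})}}
gg2 = {"types": {"Entity", "Msg", "Clock"}, "init": {"e1", "c"},
       "rules": {"send": ({"e1", "c"}, {"e1", "c", "m"})}}
system = compose(gg0, gg1, gg2)
```

In the result, the rule send carries both the queue item added in the first component and the clock item added in the second one, while enqueue is taken over from the first component only.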

Such an abstract specification of the system, used as an interface, is usually
necessary for the development of a system. It serves for the communication
between members of different teams and makes explicit the changes that affect
other components. Whenever there is a specialization relation between a
component and the abstract view, this component is a safe extension of the
abstract view, which implies that the composition of this component with other
ones with respect to this abstract view will not show any unexpected behaviour.

4 Specifying Simulation Models with Graph Grammars 

Generally speaking, a simulation model specified by graph grammars will be
composed of a set of types with their attributes, representing the static
structure of the simulation entities, and a set of rules for each type,
representing the dynamic structure or the behavior of each kind of entity. If we
analyze simulation models carefully we will realize that all of them share some
types and rules. These are the types and rules that describe the executive level,
defining not only the approach to describe the events but also the resources for
communication between entities. Hence, a simulation tool that allows the
specification of simulation models using graph grammars must provide an initial
set of types and rules. In this way the tool implements the first level of Pidd’s
hierarchy. The second level is the model itself, which, in our approach, will be
given by specializing the graph grammar modeling the first level (that is, adding
new types and rules that are specific for the system being modeled). The third
one, the detailed routines, may be provided by a set of predefined abstract data
types that implement the resources necessary at this level, like generation of
random numbers, statistics collection, generation of reports, etc.

² Formally, the composition is defined as a pullback (for the formal definition
and examples see [Rib96b,Rib99]).

Figure 1 summarizes our approach to constructing simulation models using
graph grammars. The arrows have the following meaning: full lines represent
specializations and dashed lines represent composition (formally, the arrows go
in the opposite direction).




Fig. 1. Construction of a simulation model 

In this figure the grammar SimuGG is the description of the executive level
of the simulation model: it models which kinds of attributes each simulation
entity must have and the control structure of the simulation model (specifying,
for example, how messages are sent, time variables, etc.). The simulated time is
modeled as an attribute of a main component of the system, and the stochastic
functions are operations on the time data type. From this basic model, the user
may build a specific simulation model. The first step towards this model is
to refine the type graph of SimuGG by defining the problem-specific simulation
entities and the messages that are relevant for this model. The user then adds
new rules to SimuGG specifying the abstract behavior desired in the system
being modeled. Here we can only define an abstract behavior because the
attributes of each component of the system are not yet specified (we can
describe rules saying, for example, that in reaction to some message M1, a
simulation entity E1 sends messages M2 and M3 to entity E2). This
specialization gives rise to the grammar AbsModel, which describes the
simulation model in a very abstract way (without the internal information of
each component). The advantages of defining such an abstract description of the
system are twofold: on the one hand, it can guide the definition of each
component, because the rules that shall be defined for each one are already
sketched; on the other hand, it provides a formal basis for a compositional
semantics, making it possible to check whether the changes made to some
component will modify the (desired) behavior of the system as a whole.
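
The abstract rule used as an example above (in reaction to a message M1, entity
E1 sends M2 and M3 to E2) can be illustrated by a small sketch. It is not the
graph grammar notation used in this paper: the state is simplified to a list of
pending messages, and the names Message, abstract_rule and step are assumptions
made only for this illustration.

```python
from dataclasses import dataclass


# Hypothetical, simplified state: a message is addressed to one entity.
@dataclass(frozen=True)
class Message:
    name: str
    target: str


def abstract_rule(msg):
    """Abstract behaviour: when E1 receives M1, it sends M2 and M3 to E2.

    Returns the messages that replace `msg`, or None if the rule does
    not match (the message is then left untouched).
    """
    if msg.name == "M1" and msg.target == "E1":
        return [Message("M2", "E2"), Message("M3", "E2")]
    return None


def step(pending):
    """Apply the rule once to every matching message in the state."""
    result = []
    for msg in pending:
        produced = abstract_rule(msg)
        result.extend(produced if produced is not None else [msg])
    return result


state = [Message("M1", "E1"), Message("M9", "E3")]
new_state = step(state)
```

No attributes of E1 or E2 appear in the rule, which mirrors the point that only
the exchange of messages can be fixed at this level of abstraction.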

The next step is to specify each component of the system (in the diagram,
just two components were drawn, but the number of components is arbitrary).
Typically, the most important part here is to specify the behavior of the
component. This is done by specializing the part that concerns this component
in the grammar AbsModel, giving rise to the grammar CompBeh. Then we may
also want to specify the kinds of statistics that are relevant to be generated
during the simulation of this component. For this we define the grammar
CompStat. Analogously, the graph grammar CompAnim describes how the
animation of the simulation shall be done. For example, the receipt of a
message M1 by an entity E1 may generate three rules: one in CompBeh describing
which attributes may change and which messages shall be sent in reaction to M1,
one in CompStat describing how the receipt of this message shall influence the
statistics generated for this simulation model, and one in CompAnim describing
how the receipt of this message shall influence the graphical animation of the
simulation model. Obviously, not every message will generate statistics or
graphical animation, but the behavior of the system when receiving this message
must be specified. Then, a complete description of each simulation component
can be obtained by composition of the statistics and animation grammars using
the behaviour grammar as interface grammar. The parallel composition
construction guarantees, for example, that any procedure used to get statistics
will not influence the behaviour of the system. After all components have been
specified, we may put them together, also using parallel composition.
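
The intended separation of behaviour, statistics and animation can be
illustrated as follows. This is only a sketch of the non-interference property,
not the parallel composition construction itself: the statistics and animation
handlers receive a read-only view of the entity state, so they cannot change
the behaviour. All function names (comp_beh, comp_stat, comp_anim, receive) and
the attribute "served" are hypothetical.

```python
from types import MappingProxyType


def comp_beh(state, msg):
    """Behaviour rule: may change attributes and send messages."""
    if msg == "M1":
        new_state = dict(state, served=state["served"] + 1)
        return new_state, ["M2", "M3"]
    return dict(state), []


def comp_stat(view, msg, stats):
    """Statistics rule: reads the state, updates only the statistics."""
    stats[msg] = stats.get(msg, 0) + 1


def comp_anim(view, msg, frames):
    """Animation rule: reads the state, updates only the animation."""
    frames.append(f"E1 received {msg} (served={view['served']})")


def receive(state, msg, stats, frames):
    """Composed reaction of one component to a message."""
    new_state, sent = comp_beh(state, msg)
    view = MappingProxyType(new_state)  # read-only view of the state
    comp_stat(view, msg, stats)
    comp_anim(view, msg, frames)
    return new_state, sent


stats, frames = {}, []
with_extras = receive({"served": 0}, "M1", stats, frames)
behaviour_only = comp_beh({"served": 0}, "M1")
```

Comparing with_extras and behaviour_only shows that, in this sketch, adding
statistics and animation leaves the resulting state and the sent messages
unchanged.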

5 Conclusion and Future Work 

In this paper we have presented a way of constructing discrete simulation models
for concurrent systems using graph grammars. Because this formalism is
rule-based, reactiveness can be specified in a natural way. The use of graphs
and the possibility of applying many rules in parallel make the specification of
concurrency quite natural. The provided construction ensures that the behaviour
of each component will not be affected by statistics or animation procedures, and
also that the behaviour of the whole system will be obtained by composing the
behaviours of its components. A new simulation environment, called FLATUS, is
currently being implemented based on the concepts described here.

We have been using graph grammars here as an unambiguous and high-level
language for the specification of the simulation models, and the parallel
composition to give hints on which kinds of operations shall be forbidden in the
construction of the components to ensure that the behaviour of the system will
be preserved. The main reason for choosing graph grammars as a specification
formalism for simulation models is that, besides being formal, they are quite
intuitive even for people not used to formal description languages and
techniques. This was a requirement on the modeling language, and it is a main
advantage of graph grammars compared to other simulation model formalisms like
Markov chains [Tri82], stochastic process algebras [Gt95], or even Petri nets.
Although these latter formalisms provide more sophisticated analysis techniques,
the construction of a wide range of models is rather complex for
non-theoreticians. Moreover, rich state-based descriptions as provided by graph
grammars better capture the dependencies between the events within the system
(in comparison with Petri nets, there are semantic models for graph grammars
that can describe a richer set of phenomena occurring in concurrent systems
[Rib96b]).

A very important feature is to allow formal verification of properties of the
system being modeled. The first step in this direction has already been taken,
namely to make a formal specification of this system. Currently we are
investigating which kinds of properties and techniques can be used to verify the
models. We are also investigating ways to get complexity measurements of the
models. Another interesting line of work is to introduce a module concept within
this environment to improve the possibility of reusing components.



References 



Cop97. B. Copstein, SIMOO: Plataforma orientada a objetos para simulação
discreta multi-paradigma, Ph.D. thesis, Federal University of Rio Grande do
Sul, 1997.

CK98. B. Copstein and L. Korff, Specifying simulation models using graph
grammars, ESS'98 10th European Simulation Symposium, SCS, 1998, pp. 60-64.

CPW96. B. Copstein, C. E. Pereira, and F. Wagner, The object oriented approach
and the event simulation paradigms, 10th European Simulation Multiconference,
SCS, 1996.

CPW97. B. Copstein, C. E. Pereira, and F. Wagner, Simoo - an environment for
the object-oriented discrete simulation, 9th European Simulation Symposium &
Exhibition - Simulation in Industry, SCS, 1997.

EHK+97. H. Ehrig, R. Heckel, M. Korff, M. Löwe, L. Ribeiro, A. Wagner, and
A. Corradini, Algebraic approaches to graph transformation II: Single pushout
approach and comparison with double pushout approach, The Handbook of Graph
Grammars, vol. 1: Foundations, World Scientific, 1997, pp. 247-312.

Ehr79. H. Ehrig, Introduction to the algebraic theory of graph grammars,
Lecture Notes in Computer Science, vol. 73, Springer, 1979, pp. 1-69.

Gt95. N. Götz et alii, Constructive specification techniques - integrating
functional performance and dependability aspects, Quantitative methods in
parallel systems, Springer, 1995.

KR96. M. Korff and L. Ribeiro, Formal relationships between graph grammars and
Petri nets, Lecture Notes in Computer Science, vol. 1073, Springer, 1996,
pp. 288-303.

Lin87. A. Lindenmayer, An introduction to parallel map generating systems,
Lecture Notes in Computer Science, vol. 291, Springer, 1987, pp. 27-40.

LKW93. M. Löwe, M. Korff, and A. Wagner, An algebraic framework for the
transformation of attributed graphs, Term Graph Rewriting: Theory and Practice,
John Wiley & Sons Ltd, 1993, pp. 185-199.






Löw93. M. Löwe, Algebraic approach to single-pushout graph transformation,
Theoretical Computer Science 109 (1993), 181-224.

Nag91. M. Nagl (organizer), The use of graph grammars in applications, Lecture
Notes in Computer Science, vol. 532, Springer, 1991, pp. 41-60.

Pid94. M. Pidd, An introduction to computer simulation, Winter Simulation
Conference 1994, SCS, 1994.

PL96. P. Prusinkiewicz and A. Lindenmayer, The algorithmic beauty of plants,
Springer, 1996.

Pri74. A. Pritsker, The GASP IV simulation language, John Wiley and Sons, 1974.

Rib96b. L. Ribeiro, Parallel composition and unfolding semantics of graph
grammars, Ph.D. thesis, Technical University of Berlin, Germany, 1996.

Rib99. L. Ribeiro, Parallel composition of graph grammars, Applied Categorical
Structures 7, no. 4 (1999), 405-430.

Sha75. R. E. Shannon, Systems simulation, the art and science, Prentice Hall,
1975.

Tae96. G. Taentzer, Parallel and distributed graph transformation: Formal
description and application to communication-based systems, Ph.D. thesis,
Technical University of Berlin, 1996.

Tri82. K. Trivedi, Probability and statistics with reliability, queuing and
computer science applications, Prentice Hall, 1982.



Graph-Based Reverse Engineering and 
Reengineering Tools 



Katja Cremer 

Department of Computer Science III,
Aachen University of Technology

katja@i3.informatik.rwth-aachen.de



Abstract. In this paper a reengineering approach is presented which 
uses graph transformations as formal background. The term reengineer- 
ing describes any kind of activities concerned with the renewal and im- 
provement of existing (software) applications. For this purpose the struc- 
ture of an application has to be recovered to get information about rel- 
evant components and their relations. In this context graph rewriting 
systems are used to specify objects and relations and to determine pos- 
sible changes in order to improve the system. 



1 Motivation 

In a joint project¹ with two German companies the problem of migrating
existing applications into distributed environments was investigated. For many
applications it is necessary to run them no longer on central mainframes but in
a heterogeneous environment on different computers. There are many languages,
methods and tools to support the development of distributed applications.
Popular representatives (e.g. [1, 17, 21]) are middleware products conforming to
the CORBA standard [14].

However, many existing applications cannot use these technologies without
modifications because they lack structure. A prerequisite for distribution is
the existence of logical units (modules, subsystems) which can serve as cutting
lines. This precondition must be satisfied before the use of middleware is
possible. The main task is to divide an existing application into portions which
are able to interact as client and server parts.

When restructuring an existing system for distribution, other goals are
pursued as well: the new system should be clear in its architectural structure,
maintainable and, especially, extensible, and possibly be ported to a new
language version.

In our case we had to deal with COBOL programs. The data handling and
application functionality should remain on a server, whereas the user interface
and local checks should be placed on client computers. Furthermore, not all
parts of an application can be reused. Some portions, like the
character-oriented user interface, are usually replaced completely.

¹ This project was funded by the German Ministry for Research and Education and
the companies AMI and GEZ. The dissertation of K. Cremer [4] received the
Software Engineering Prize 1999 from the Ernst Denert foundation.

M. Nagl, A. Schürr, and M. Münch (Eds.): AGTIVE’99, LNCS 1779, pp. 95-109, 2000.

This paper describes the task of preparing existing applications for their 
reuse in distributed environments by means of graph rewriting systems. Another 
problem is the use of concrete middleware products. This task has been treated 
in another sub-project, where similar techniques have been applied (cf. [16]). 

The remainder of this paper is organized as follows: In section 2 the
developed strategy is presented. Section 3 describes the tool environment which
supports the task of preparing existing applications for distribution. In
section 4 the development process of the tools is elaborated. In particular the
use of graph rewriting techniques and the use of the PROGRES system are
presented. The adaptability of the approach is presented in section 5. Section 6
gives an overview of related work. Some conclusions are given in section 7.



2 Approach for Reverse and Reengineering 

In this section the approach for reverse and reengineering (R&R) is described. 
Fig. 1 outlines the necessary steps to migrate an existing system into a dis- 
tributed one. 



Fig. 1. Reverse and reengineering approach

Reengineering starts with the complete source code of the application
concerned. We have chosen this starting point because source code is the only
reliable source of information. Additional documents like manuals, reports, or
technical documentation are informal (with respect to format and contents).
Therefore, it is almost impossible to decide whether they are consistent with
the source code documents or not.

In the project we have examined applications which are written in COBOL 85
[22]. Sample fragments of COBOL programs are shown in fig. 1. The concrete
programming language does not influence the approach, only the concrete
implementation of the supporting tools. This implementation, however, can be
adapted to other languages, as we will see. The applied R&R method consists of
three steps, which are described in the following:

(1) All activities which create descriptions on a higher level of abstraction
than the source code level are summarized by the term reverse engineering. The
result of this step (sketched on the left hand side of fig. 1) is a document,
called system structure document, including information about relevant
components and their interrelations. The system structure document represents
the current state of an application. We use an architecture description
language (ADL) [8] to specify the logical units which are necessary for
distribution purposes.

The reverse engineering process searches source code documents for
components like files, programs, data structures and procedure parts and also
for relations like “is-contained-in”, “uses” and “imports”. The occurrences of
corresponding source code artifacts are saved in the system structure document.
The extracted information is stored in the form of a graph (top left in
fig. 1): components are mapped onto nodes, relations are represented by edges.
The concrete source code artifacts are not present in the graph representation,
but every graph node and edge has attributes containing information about the
location of the source code artifacts. Thus the correlation between the
abstract system structure document and the concrete source code level is
maintained.
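
As an illustration of this representation, the following sketch stores
components as nodes and relations as edges, each carrying a location attribute
that points back to the source code. It is a minimal model under assumptions,
not the data structure of the tools described here: the class names, the
attribute layout and the file name payroll.cbl are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Location:
    """Where a source code artifact was found (hypothetical layout)."""
    file: str
    first_line: int
    last_line: int


@dataclass
class StructureGraph:
    nodes: dict = field(default_factory=dict)  # node id -> (kind, Location)
    edges: list = field(default_factory=list)  # (relation, source, target, Location)

    def add_component(self, node_id, kind, loc):
        self.nodes[node_id] = (kind, loc)

    def add_relation(self, relation, source, target, loc):
        # Relations may only connect components that are already known.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("unknown component")
        self.edges.append((relation, source, target, loc))

    def locate(self, node_id):
        """Follow the link from the abstract node back to the source code."""
        return self.nodes[node_id][1]


graph = StructureGraph()
graph.add_component("PAYROLL", "program", Location("payroll.cbl", 1, 120))
graph.add_component("CALC-TAX", "paragraph", Location("payroll.cbl", 80, 95))
graph.add_relation("is-contained-in", "CALC-TAX", "PAYROLL",
                   Location("payroll.cbl", 80, 80))
```

The graph itself holds no source text; locate only returns the position at
which the artifact can be found.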

(2) The next step of our strategy is a re-design from the system structure 
to independent architectural units. The aim of the re-design is to constitute an 
assignment between parts of the system structure document and architectural 
units. In addition, data and control dependencies of the old system are “local- 
ized” by putting the corresponding code fragments into architectural units. For 
these units the interfaces have to be defined. The created architecture units are 
stored in the architecture document (top right in fig. 1). 

(3) Both the system structure and the architecture document are abstractions
from concrete source code in a specific programming language. Re-design
has to be combined with changes of the source code to transmit the adaptation
from the abstract (architecture document) to the concrete level (target source
code). In our case the target language was COBOL 9x. The corresponding source
code transformations are sketched in fig. 1 by the arrow with the description
re-code and by the connection to the re-design step.

Of course, not every programming language contains concepts to express
architecture specifications. However, in [12] it is argued that it is mostly
possible to “simulate” architectural descriptions within source code. The
effort depends on the age and the type of the programming language and its
limits (e.g. trying to map an object-oriented architecture to an old FORTRAN
version).

The aim of the described approach is to couple re-design and re-coding. The
preparatory steps for the distribution of an application are planned and
performed on the architectural level. These changes initiate operations for
adaptations on the concrete source code level. The strategy provides no support
for changes that are made only on the source code. It is not possible to
transfer all kinds of changes from the design level to the source code level
automatically. In some cases post-processing is necessary.



3 Reverse and Reengineering Tools 



In this section four tools are presented which support the described approach.
The tools form an integrated environment and have a common user interface.
Fig. 2 shows a screen dump of the ReForDi (Reengineering For Distribution) tool
prototype.




Fig. 2. Integrated tool environment

In the main window on the left hand side a part of a system structure
document is shown. The part surrounded by the dashed rectangle is assigned to an
architectural unit (shown on the right hand side). This assignment is explicitly
expressed by nodes and edges in the middle of the main window. From the
common graphical user interface shown in fig. 2 all tools can be started. We now
describe which functionality is offered by the tools of the environment and how
the tools can be applied. The tools are related to the above steps (1-3).

(1a) Design Recovery Tool: This tool searches for particular source code
artifacts which are relevant for extracting the structure of an existing
application. The occurrence of such artifacts is stored as a textual graph
description. In the current implementation the analysis of COBOL 85 programs is
supported. This tool is called from the menu File of the common user interface.

(1b) Visualization Tool: This tool is responsible for the graphical
representation of the textual information created in the analysis phase.
Basically, the visualization tool is a graph browser which offers the definition
of different views. This tool and the analysis tool together are referred to as
the reverse engineering tool.

(2) Re-Design Tool: Based on the descriptions created and visualized by the
reverse engineering tool, the re-design tool offers operations to change the
structural properties of an application. The aim of the changes is the
assignment of parts of the system structure document to new architectural
units. The operations are divided into two groups: manual operations and
re-design techniques. The manual operations are used to restructure an
application step by step, with the complete re-design process being controlled
by a human expert. In addition, the re-design tool offers four different
re-design techniques which automatically assign components of the system
structure document to architectural units. Which of the techniques is applied
depends on the structural properties of the considered software system. The tool
operations are called from the menus Manual_Re-Design and Re-Design_Algorithm.

(3) Source Code Transformation Tool: The re-design transformations are
exclusively performed on the architectural level. To keep concrete source code
documents consistent with the new architecture, this tool offers coupled source
code transformations. They are performed when the re-design is finished. The
operations of this tool are offered by the menu Source_Transformations in
fig. 2.

4 Tool Development Process 

The presented tools are not coded manually. Instead, tree and graph
transformations are used, which enables the reuse of the tool development
process for other R&R tasks. Tree transformations are used to create the
necessary abstractions (reverse engineering) but also to perform the source code
transformations. Graph transformations specify the possible changes on the
architectural level. The use of these transformations for the development of
R&R tools is described in detail in the following subsections.



4.1 Tree Transformations for Reverse Engineering

The realization of the design recovery and the source code transformation tool
((1) and (3) in fig. 1) is based on the transformation language TXL (Turing
eXtender Language [2,3]), which supports textual transformations. Fig. 3 shows
the application of tree transformations for reverse engineering purposes.



[Figure labels: source code, TXL transformations, structure-relevant part of
the parse tree, unparsing]

Fig. 3. Tree transformation for reverse engineering



TXL is based on the transformation of textual input into textual output. 
By means of the TXL language it is possible to describe the syntax of the 
input. Based on this information TXL is able to create a simple parser. In addi- 
tion rules can be specified which define transformations on the parse tree. The 
transformations are executed by the TXL transformation engine. Furthermore, 
transformation rules add information to a parse tree. This additional informa- 
tion represents knowledge about source code artifacts which is relevant for the 
recovery of the structure of an application. TXL is also responsible for unparsing. 
The added information is collected and forms a textual graph description. Fig. 
4 shows an extract of a TXL specification. 

The current implementation of the design recovery tool creates abstractions
from programs written in COBOL 85. The TXL specification of fig. 4 is
responsible for the detection of PARAGRAPH constructs in COBOL. An occurrence of
this kind of source code artifact adds appropriate information to the parse
tree. In the following, we call this information facts. In the upper part of
fig. 4 the syntax of the considered source code artifacts is shown, in the
middle part the fact which is added for every occurrence of a PARAGRAPH. The
fact keeps information about the name, the location (in the form of line
numbers) and the file a source code artifact is contained in.

definitions of textual input and output

    define Paragraph
            [LineNumber] [CobolId] .
            [ParagraphCommands]
        |
            [LineNumber] [CobolId] .
            [ParagraphFact]
            [ParagraphCommands]
    end define

    define ParagraphCommands
        [repeat CobolCommand+]
    end define

    define CobolCommand
            [AcceptCommand]
        |   [DisplayCommand]
        |   [CallCommand]
            ...
        |   [ComputeCommand]
    end define

definition of the added information

    define ParagraphFact
        $ ParaId ( [ProgramName] , [CobolId] , [CobolId] , [LineNumber] )
    end define

transformation rule

    rule FindParagraph Prog [ProgramName] Sect [CobolId]
        replace [Paragraph]
            z [LineNumber] id [CobolId] .
            paracom [ParagraphCommands]
        construct Fact [ParagraphFact]
            $ ParaId ( Prog, Sect, id, z )
        by
            z id .
            Fact
            paracom
    end rule

Fig. 4. Example of a TXL specification



In the bottom part of fig. 4 the transformation rule is presented which adds
the fact to the parse tree. The rule comprises two parts: The first part (replace)
defines the search pattern which describes the considered source code artifacts.
The second part (by) specifies the additions to the parse tree.
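
The effect of such a rule can be imitated outside TXL with a few lines of
Python. The sketch below is not a translation of the TXL specification and does
not build a parse tree: it assumes a strongly simplified input in which a line
of the form "<line number> <name>." opens a paragraph, and it inserts a fact
line in the style of fig. 4 after each such header. The function names and the
sample program are hypothetical.

```python
import re

# Assumed, simplified input: "<line number> <paragraph name>." opens a
# paragraph; every other line is treated as a command of that paragraph.
HEADER = re.compile(r"^(\d+)\s+([A-Z][A-Z0-9-]*)\.\s*$")


def add_paragraph_facts(lines, program, section):
    """Insert a fact line after every paragraph header."""
    out = []
    for line in lines:
        out.append(line)
        match = HEADER.match(line)
        if match:
            number, name = match.groups()
            out.append(f"$ ParaId ( {program}, {section}, {name}, {number} )")
    return out


def collect_facts(lines):
    """Keep only the fact lines, which form the textual graph description."""
    return [line for line in lines if line.startswith("$ ")]


source = [
    "000100 INIT-PARA.",
    "000110     DISPLAY 'START'.",
    "000200 CALC-PARA.",
    "000210     COMPUTE X = Y + 1.",
]
annotated = add_paragraph_facts(source, "PAYROLL", "MAIN")
facts = collect_facts(annotated)
```

A grammar-based tool such as TXL avoids the main weakness of this sketch, namely
that a one-word statement terminated by a period would be mistaken for a
paragraph header.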

There are 24 TXL transformation rules in total which detect structurally
relevant source code artifacts and add facts to the parse tree. The unparsing
step collects these parts of the parse tree, resulting in a knowledge base which
contains all relevant facts about source code artifacts and is interpreted as a
textual graph description.

4.2 Graph Transformations for Re-Design

We have argued that the system structure and the architecture document are
internally stored as complex graphs. Both the approach and the realization
of the re-design tool are based on the use of graph technology (for a detailed
description see [13]). This means that

— all occurring documents are modeled as special types of graphs, 

— every operation which can be performed on a document is defined as an 
operation which changes the underlying graph, 

— a tool implementation can be derived from the specification of the graph 
type and its operations. 

We use the VHL (very high level) language PROGRES [20, 18, 24] to develop
tools based on graph technology. PROGRES offers constructs to define graph types
and operations on them. Fig. 5 shows the graph types which describe the internal
structure of the considered documents.



[Figure: the graph types of the system structure document (COBOL_OBJECT,
COBOL_RELATIONSHIP), the correspondence document (MAP_OBJECT, MAP_RELATIONSHIP,
MAP_ITEM) and the architecture document (ADL_OBJECT, ADL_RELATIONSHIP), based on
a common root schema with the node classes ITEM, OBJECT and RELATIONSHIP and
the edge types from_src and to_trg]

Fig. 5. Definition of the graph types



The lower part of fig. 5 shows the basic graph scheme in PROGRES notation
on which the graph schemes of the system structure and the architecture
document are based. Additionally, a third graph type is shown, the graph type
of the correspondence document. This document type contains the assignments
which are established during the re-design between parts of the system
structure and the architecture document. The connection to the source code level
is modeled by the node type FILE_ITEM.

In the PROGRES notation rectangles represent node classes and rounded
boxes represent node types (not shown in fig. 5). Dotted arrows depict
inheritance relations, edge types are represented by solid arrows. The basic
scheme divides all node and edge types of the system structure and architecture
document into two categories: objects and relations. Relations are modeled as an
edge-node-edge pattern. This kind of modelling makes it possible to attach
attributes to relations. The scheme of the correspondence document inherits from
the node class ITEM of the basic scheme, because assignments are connections
between objects and relationships. This way of modelling takes the special role
of the correspondence document into account.
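
The edge-node-edge pattern can be illustrated by a small sketch in which a
relation is itself a node with attributes and two references play the role of
the edges. The class names follow the node classes ITEM, OBJECT and
RELATIONSHIP and the edge types from_src and to_trg, but the sketch is not
PROGRES code, and the direction of the two edges as well as the helper relate
are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Common root of all nodes (cf. node class ITEM)."""
    name: str
    attributes: dict = field(default_factory=dict)


class Object(Item):
    """A component, e.g. a program or a paragraph."""


@dataclass
class Relationship(Item):
    """A relation modelled as a node between two edges."""
    from_src: Object = None  # edge connecting the relation to its source
    to_trg: Object = None    # edge connecting the relation to its target


def relate(name, source, target, **attributes):
    """Create the edge-node-edge pattern source - relation - target."""
    return Relationship(name, dict(attributes), source, target)


caller = Object("MAIN-PROG")
callee = Object("TAX-PROG")
call = relate("uses", caller, callee, line=42)
```

Because call is a node of its own, it can carry the attribute line, which a
plain edge between caller and callee could not.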

Having introduced the different kinds of documents concerned, we can now
describe the re-design process as a transformation from one document type to
another. We have to define operations not only on one graph type but on
different ones, and we have to establish relations between them. In the concrete
scenario we have to realize the transformation from subgraphs of the system
structure document to those of the architecture document. Relations between the
two documents established during the transformation process are stored in the
correspondence document.

To handle this we use triple graph grammars [19, 10] for the specification of 
the re-design operations. The idea of triple graph grammars is based on pair 
grammars introduced by Pratt [15] in 1971. Pair grammars define a context free 
(graph) grammar for every involved document. These grammars are coupled by 
a definite mapping between different rules. A node (left side of a rule) of one 
grammar is mapped to a node of the other grammar. This principle is adopted 
by triple graph grammars, i.e. graph grammars are specified for the involved 
documents and correspondences are defined between the rules. In contrast to 
pair grammars triple graph grammars allow m:n relations between rules and the 
nodes and edges of the rules. 

In fig. 6 an example of a rule of a triple graph grammar in PROGRES
notation is shown. This rule specifies a possible correspondence between the
system structure and the architecture document. In this triple graph rule the
solid parts represent the left hand side and the dashed portions express the
right hand side of the rule.

From every triple graph rule three transformations can be derived. Forward 
rules define transformations from the document type on the left hand side (in 
fig. 6 the system structure document) to the document type on the right hand 
side (here the architecture document). Backward rules specify transformations 
in the opposite direction and correspondence rules establish relations between 
existing parts of the participating document types. For the specification of the 
re-design transformations the forward rules are of importance. In this context 
the main task of re-design transformations is the identification of portions of the 
system structure document which can be correlated to architectural units. 

The triple rule shown in fig. 6 is such a forward rule, which searches for a
node of the class PROCEDURE and establishes a connection to a node of the type
Method. The node class PROCEDURE summarizes the node types Section and Paragraph
(not shown in fig. 6), which abstract from procedural parts of COBOL programs.
The triple rule assigns procedural parts of a regarded program to a method which
belongs to a class. To express this assignment the dashed nodes 9, 10, 11 are
added. The idea behind this transformation is the reuse of procedural code
fragments.

Forward_Rule Map_Procedure_to_new_Method ( proc_node : PROCEDURE ; cl_node : Class ) =

    [graphical part of the rule]

    transfer 10.transformed := false;
             10.OO_File := " ";
end;

Fig. 6. Triple graph rule for re-design (forward version)
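
The effect of such a forward rule can be sketched in Python. This is not
generated PROGRES code and does not reproduce the graphical part of the rule:
the three documents are reduced to plain dictionaries and a list, the name of
the new method node is invented, and attaching the attributes transformed and
OO_File to the method node is an assumption, since the rule text alone does not
say which node carries them.

```python
def map_procedure_to_new_method(structure, correspondence, architecture,
                                proc_node, cl_node):
    """Forward rule sketch: relate a PROCEDURE node to a new Method node."""
    # Left-hand side: both parameter nodes must exist with suitable types.
    if structure.get(proc_node) not in ("Section", "Paragraph"):
        return False
    if architecture.get(cl_node, {}).get("type") != "Class":
        return False
    # Right-hand side: add the method, attach it to the class, and record
    # the assignment in the correspondence document.
    method = f"method_for_{proc_node}"  # hypothetical naming scheme
    architecture[method] = {"type": "Method", "owner": cl_node,
                            "transformed": False, "OO_File": ""}
    correspondence.append((proc_node, method))
    return True


structure = {"CALC-TAX": "Paragraph"}
architecture = {"Payroll": {"type": "Class"}}
correspondence = []
applied = map_procedure_to_new_method(
    structure, correspondence, architecture, "CALC-TAX", "Payroll")
rejected = map_procedure_to_new_method(
    structure, correspondence, architecture, "MISSING", "Payroll")
```

The system structure document is only read, while the architecture and the
correspondence document are extended, which is the characteristic of a forward
transformation.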

By means of triple graph rules the possible re-design operations are
specified. An implementation in C is generated by the PROGRES environment. The
specified rules can be activated by a human expert who has to decide which
transformation is reasonable and has to control the whole re-design process.
Additionally, four re-design techniques have been specified which can be
interactively invoked, and which can be applied depending on the original
structure of a software system. These re-design techniques combine different
triple graph rules into complex graph transactions.

4.3 Graph Traversal and Tree Transformations for Re-Coding



Until now only changes on the abstract level have been considered. However, an
essential aim of the presented approach is the combination of text and graph
representations. Fig. 7 sketches the coupling of the different levels of
abstraction.




Fig. 7. Combined Re-Design and Re-Coding (figure not reproduced; the only surviving label reads "Re-Implementation")



After the re-design process is finished, the architectural description is
transformed into text frames. These frames are enriched with existing source
code fragments, which are accessed via links stored in attributes of the system
structure graph; these attributes record the location of the source code
artifacts. The frames and the reused source code parts form the so-called
intermediate code, which is the basis for the actual creation of source code.

During the source code creation, the annotations are replaced by source code
constructs of a concrete target programming language, and the reused source code
parts are adapted. The intermediate code is created by a graph traversal: every
traversed node and edge is transformed into textual form, and the connected
source code parts are included after having been adapted. The source code
creation is based on TXL, using the same techniques as the design
recovery tool described above. Here, textual parts of the intermediate code are
replaced by concrete source code parts of a specific programming language.
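
The graph traversal described above can be illustrated by a minimal Java sketch. It is not the generated tool: the class names FrameNode and IntermediateCode, the sourceRef attribute, and the frame syntax with its @reuse annotation are assumptions made only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node of the architecture graph: a kind (e.g. "class", "method"),
// a name, an optional link to reusable source code, and child nodes.
final class FrameNode {
    final String kind;
    final String name;
    final String sourceRef;   // location of an existing source fragment, or null
    final List<FrameNode> children = new ArrayList<>();

    FrameNode(String kind, String name, String sourceRef) {
        this.kind = kind;
        this.name = name;
        this.sourceRef = sourceRef;
    }

    FrameNode add(FrameNode child) {
        children.add(child);
        return this;
    }
}

final class IntermediateCode {
    // Depth-first traversal: every visited node becomes a text frame. A node that
    // carries a source link gets a reuse annotation, to be resolved during re-coding.
    static List<String> emitFrames(FrameNode node, int depth) {
        StringBuilder indentBuilder = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indentBuilder.append("  ");
        }
        String indent = indentBuilder.toString();

        List<String> out = new ArrayList<>();
        out.add(indent + "<" + node.kind + " " + node.name + ">");
        if (node.sourceRef != null) {
            out.add(indent + "  @reuse " + node.sourceRef);
        }
        for (FrameNode child : node.children) {
            out.addAll(emitFrames(child, depth + 1));
        }
        out.add(indent + "</" + node.kind + ">");
        return out;
    }
}
```

In such a setting, a later pass (TXL in the approach described here) would replace the annotations by constructs of the target language.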
5 Adaptability to Other Reengineering Problems

As mentioned before, the predefinitions for the tool behaviour are not
hard-coded but specified by means of TXL and PROGRES. The executable
tool code is generated automatically from the specification documents. These
specification documents can therefore be adapted to influence the tool
behaviour.

The use of generator technology gives the tool developer the possibility to
change the underlying programming language or language version for
reengineering. Furthermore, the goals of reengineering can change, in which
case the transformations on the source code and the design level have to be
adapted. Fig. 8 sketches the tools described so far and their generation
parameters, i.e. the different specification documents.




Fig. 8. Generated tools and their defaults (figure not reproduced; a surviving label reads "graph pass & reuse of existing source code")



In fig. 8 the generated tools are represented in bold face. Furthermore, the 
types of documents processed by the tools are shown. The grey boxes depict the 
documents which are used as generation parameters. These documents are the 
input for TXL and PROGRES generators. 

First, fig. 8 shows the adaptability parameters of the design recovery tool.
It can be specified which source code artifacts are mapped onto which graph
nodes and edges. These predefinitions are used to generate a parser which creates
graph descriptions. Furthermore, the graph types of the system structure and
the architecture document are customisable. The tool developer can also
define the source code and design transformations.
Considering all adaptation facilities, the following tool development process
results:

1. definition of the relevant source code artifacts by means of a TXL grammar 

2. determination of the graph types of the system structure and architecture 
document using PROGRES graph schemes 

3. definition of the mapping between structurally relevant source code artifacts
and graph nodes and edges of the system structure document; this definition
is given by means of TXL transformation rules

4. predefinition of re-design transformations using triple graph rules 

5. definition of source code transformations by means of TXL transformations 

6. determination of the coupling between the different transformation types 
with the help of a PROGRES specification 

Examples of definitions for the first four steps have been shown in section 4. 



6 Related Work 

In the context of reverse engineering and reengineering many projects can be
found, but only a few of them address the preparation of applications for
distribution. The basic ideas for this preparation are established in the
software engineering community, and a lot of work has been done (see e.g.
[11]). However, the implementation of these concepts using PROGRES and
TXL is a new approach.

There are some other projects in the field of reverse engineering and
reengineering. Rigi [23] is a popular interactive tool for reverse engineering.
Source code artifacts are represented as graphs. Rigi offers good concepts for
the visualization of complex graph structures. The tool user is supported by
additional composition algorithms. Rigi is a pure reverse engineering tool; the
connection of source code artifacts to the represented graphs is lost. Besides
the composition algorithms, Rigi offers no other graph manipulation operations.

COBOLT [6] is an approach which supports the stepwise conversion of
COBOL legacy systems into an object-oriented form. COBOLT is one of the
few tools for this task. So far the transformation process is opaque, i.e. the
transformation knowledge is neither accessible nor customisable.

In the GUPRO project [9] a generator for multi-language program
understanding tools has been developed. Source code is mapped onto a graph
representation defined by so-called concept diagrams (similar to PROGRES graph
schemes). By using a query language the user can achieve a better
understanding of the program. The main advantage of this approach is the
adaptability to different programming languages. However, the lack of graph
representations is a disadvantage of the GUPRO approach. Graph manipulations
are the subject of ongoing work.

One project which uses graph transformations and triple graph rules as well
is the VARLET project [7]. The aim of this project is the migration of database
schemes from a relational to an object-oriented paradigm. The mapping is
specified by triple graph rules. Another project in the context of database
reengineering is the DB-MAIN project [5].



7 Conclusion 

In this paper we have presented a reverse engineering and reengineering
approach and supporting tools. In particular, the tool development process was
described. The aim of the approach supported by these tools is the migration of
existing applications to a distributed environment of components communicating
via middleware products conforming to the CORBA standard.

The first step of the underlying approach provides the reverse engineering 
of the concerned application. This is done with the help of the specification 
language TXL. The recovered design information is visualized as a special type 
of graph. The system structure graph type is defined in the PROGRES language 
and the acquired facts are mapped onto graphs of this type. The system structure 
graph contains information on a higher level of abstraction than the source code. 

The graph is not only used for reverse engineering purposes but also as a base
for the re-design of an application. The re-design transformations are defined as
triple graph rules and serve to assign parts of the system structure
document to an object-like structure in the architecture document. In addition
to the pure triple graph rules, complex graph transactions are available which
support the whole re-design process.

Every transformation on the structural level must have corresponding trans- 
formations on the source code level. The transformations are performed such 
that middleware products can be used for distributing the resulting program. 

The approach can be applied to many reverse engineering and reengineering
problems. To this end, different modification steps have to be carried out in
the mechanical tool construction process.
Abstract. A suitable software architecture, for example in the area of distributed
applications, can be composed of known-to-work solutions, which are also known
as design patterns. However, there is little tool support for the construction of an
application that conforms to design patterns. In most cases, the patterns are
captured informally as descriptions in natural language. Our approach uses graph
queries and graph rewriting rules to specify the patterns. A prototype that is able
to execute these rules can be generated from the graph grammar specification
by means of the PROGRES environment. The advantages of this approach are
twofold: (1) patterns are specified on a high level of abstraction and (2) the
resulting tools can be easily adapted to new patterns.



1 Background 

The work presented here is derived from a project which we started in 1995 at the
technical university of Aachen. The project (funded by the BMBF) includes two industrial
partners (the Aachener and Münchener insurance company and the GEZ, an institution
charging TV fees). The project aimed at tools supporting the distribution of monolithic
applications. The resulting application should employ a middleware like CORBA
[OMG98] or DCOM [Ses97]. We divided this task into two subgoals: (1) transformation
towards an object-based structure (if not already existent), and (2) distribution of the
application. The latter implies a restructuring of the application to meet certain
prerequisites of the middleware, for example the indirection of remote object instantiation
via a factory. Thus, we check conformance with distribution prerequisites. If these are
not fulfilled, we apply known solutions through a sequence of transformations. This
task can also be interpreted as the application of a design pattern.

The ability to apply patterns is not only useful in the original project context, i.e.
during the re-engineering of an application. It is also useful during the development of a
new application, i.e. during forward engineering. In the context of the project, we
investigated patterns related to distribution. The approach and the tool we present
here support these patterns. However, they are easily extensible to new patterns
(not necessarily in the context of distributed systems).

To summarize, we developed a tool supporting the following needs: (1) analysis of
whether the current application (given a specific distribution structure) violates
distribution prerequisites, and (2) transformation of an application in such a way that
it conforms to a specific pattern.

In this paper we present a tool based on graph queries and graph rewriting steps to solve
the issues above. We briefly introduce a graph schema that allows us to represent the
architecture of a program as a graph. The information captured in the graph is almost
identical to that in UML's class diagrams; in fact, the graph schema definition closely
resembles the meta model of the class diagram in UML. We show how such a tool can
work, particularly how architecture analysis and transformation map to graph queries
and rewriting rules, respectively. Thus, it is possible to document the behavior of the
tool on a high level of abstraction, an advantage over “conventional” tools.

In section 2 we introduce a small example program. It will be used as a running example
to illustrate the need for (and effect of) transformations. We outline the structure of our
tool in section 3. The main part of this paper (section 4) examines specific patterns and
their specification by means of graph queries and graph productions. Section 5 summarizes
related work.



2 An Example 



Let us consider the architecture of an application administering account data as shown
in Fig. 2. It is depicted as a class diagram in a notation similar to UML [Rat97], as shown
in Fig. 1. We use the option in UML to define a graphical shorthand for stereotyped
elements (here dependency relations): directed lines denote invocations, dashed lines
denote create operations. Folder symbols denote packages.



Fig. 1. Architecture Elements (figure not reproduced). Elements: class/interface (box
with «stereotype», (class)name, attributes, and methods) and package with contained
classes (folder labelled with the package name). Relationships: association, inheritance,
and stereotyped dependency; shorthands exist for the «creates» and «calls» dependencies.
The example system contains two packages: gui (graphical user interface) and
database. The class MainWindow contains the static method main which instantiates the
classes Collection and MainWindow. The other classes are instantiated in other methods
of MainWindow. Therefore, dashed arrows point from MainWindow to all other
classes, including a self reference.




Fig. 2. Architecture of the Example 



There are two obvious design deficiencies that prevent a distribution between database
and user interface: (1) the class Collection should be instantiated only once, but it is
not modeled as a singleton, and (2) MainWindow instantiates Entries directly instead of
using a factory method of the Collection. Upon distribution, the former would lead to
multiple instances of the class Collection, the latter to a remote instantiation, which is
not supported by current middleware technologies.
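
For illustration, a structure without these two deficiencies could look like the following minimal Java sketch. The class names Collection and Entry and the method createNewEntry follow the example; getInstance, size, and the id field are assumptions made for this sketch and are not taken from the example system.

```java
import java.util.ArrayList;
import java.util.List;

// Entry has a package-private constructor, so classes outside the database
// package (such as MainWindow in package gui) cannot instantiate it directly.
final class Entry {
    final int id;

    Entry(int id) {
        this.id = id;
    }
}

// Collection modelled as a singleton that also offers a factory method.
final class Collection {
    private static final Collection INSTANCE = new Collection();
    private final List<Entry> entries = new ArrayList<>();

    private Collection() {
        // no instantiation from outside: exactly one instance exists
    }

    static Collection getInstance() {
        return INSTANCE;
    }

    // Factory method: the Entry is created where the Collection lives,
    // so no remote instantiation is needed after distribution.
    Entry createNewEntry() {
        Entry e = new Entry(entries.size() + 1);
        entries.add(e);
        return e;
    }

    int size() {
        return entries.size();
    }
}
```

A client such as MainWindow would then call Collection.getInstance().createNewEntry() instead of using new for either class.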

These deficiencies can be detected automatically as shown in [Rad00]. In section 4, we
will tackle another problem related to distribution: remote classes can only be accessed
via a separate interface definition. Before we discuss this problem, we will now sketch
the basic methodology of a tool we have developed.



3 Methodology and Tool Support 



Up to now, we have not mentioned how the tasks outlined in the preceding section relate 
to graph rewriting. The implementation of our tool employs a graph, representing the 
architecture of the subject system. A transformation of the program is thus identical to 
a rewriting of the architecture graph and additional source code alterations. It is also 
possible to gain information about the system (and potential pitfalls upon distribution) 
by analyzing the graph by means of queries. 
Our tool is implemented using the high-level specification language PROGRES
[SWZ99].¹ PROGRES features a language to specify a schema and transformations
that operate on graphs conforming to the schema.

PROGRES is not only a specification language for graph transformations; it also comes
with an environment for editing and interpreting specifications. Besides the
interpretation in the PROGRES environment, a prototype can be generated (C code, user
interface based on Tcl/Tk).

The PROGRES schema (which can be roughly compared to the notion of a schema
in database modeling) specifies the structure of graphs. The nodes in a graph are
instances of types that are specified in the schema. PROGRES uses a two-tiered type
system: nodes are instances of node types, which inherit their properties from one node
class. Multiple inheritance is possible between node classes. A node class specifies the
attributes of nodes and the (typed) edges between them.

The schema serves as a meta model for our architecture description language. Figure
3 shows the schema of our architecture language. Inheritance relationships are denoted
by hollow arrows, open arrows represent labeled edges.

In our case, the nodes in the graph represent elements of an architecture
description language (ADL for short). Thus, it is not surprising that the graph
schema resembles parts of the UML meta model [Rat97]. Here, all nodes inherit from
ADL_OBJECT. For example, ADL_OBJECT has an attribute called name that is inherited
by all other node types. As in the UML, classifier is a superclass for classes and
interfaces. This is useful because interfaces and classes share many features. Both are,
for example, potential targets of a has_method relationship. The relationships are
modeled as first-class citizens (using an edge-node-edge construct), particularly to allow for
inheritance on relationships. The relationships are defined in another part of the schema
and comprise, for example, dependencies (sub-classed further into call and create
relationships) and structural features like inheritance. The source and target nodes of these
relationships are rather obvious; therefore we do not show them here. Further details
can be found in [Rad98].

The node type partition is an extension to a “normal” class diagram. Classes, interfaces
and whole packages may be attached (represented by a suitable edge) to a partition. A
partition is a part of the distributed application that contains all elements attached to it.
It can be executed by a process of the underlying operating system. The notion stems
from the programming language Ada [Int94].



3.1 Tool Structure 

The distribution tool consists of four integrated parts: an analyzer of existing programs,
an architecture and source code transformation tool and a middleware-specific
generator. The structure is shown in Fig. 4.

Let us summarize the steps: 

¹ http://www-i3.informatik.rwth-aachen.de/research/progres

Fig. 3. Graph Schema of a UML-like Architecture Description Language



(1) We gain an architectural view of a software system by analyzing the source code
and building up a graph structure (reverse engineering). The analyzers we have built
use either the transformation language TXL [CCH95] or the utility JavaCC (which
is similar to lex/yacc). They are able to parse Java and, with restrictions, C++ and
Modula-3 programs and build up a graph representing the system’s structure.
Structure graphs of Cobol programs could be another source for building up the
architecture by means of a transformation step described in [Cre98].

(2) In this step, we attach types and packages to partitions (“planning of distribution”).
We then apply graph queries to check whether (distribution) preconditions are
satisfied. An example is the requirement that caller and callee must be co-located,
either because of a create relationship or because of a user annotation intended to
eliminate efficiency bottlenecks. For lack of space, we omit a description of this query here.

(3) If we want to perform a persistent transformation of the system, we have to do
it on two levels: source code and architecture. The latter is described by a graph
rewriting rule, the former by a sequence of basic source code transformations (e.g.
adding, renaming, deleting classes or methods). The source code transformations
are specified on a high level using the JavaTree utility. In the following section, we
will examine specific graph rewriting rules in detail.
Fig. 4. Tool structure (figure not reproduced). Surviving labels: "Source code" and
"Source code, distributed". The figure also contains the following source fragment,
which is cut off in the original:

    package gui;
    import db.Collection;
    class MainWindow {
      public MainWindow() {
        Collection c =
          new Collection();
        c.createNewEntry();
      }
      public Paint() {



(4) Prior to compilation, the source code is analyzed and annotations are resolved. For
each partition, the necessary parts of the overall application code (computed by
a graph query) together with generated stubs (required by the middleware) and a
makefile are written to a separate directory of the file system.

This generation process also comprises transformations of the architecture in each
of the partitions. These transformations can also be described consistently by graph
transformations. In this step, we start inserting middleware-specific code into the
application. Normally we do not want to keep this code as a basis for further
deployment (because it would be difficult to exchange the middleware or distribution
structure in the future). This is the reason for the separation between steps (3) and
(4). We call step (3) a persistent and step (4) a transient transformation.

The generator is integrated with the source code transformation machinery based
on JavaTree. It currently contains about 18,000 lines of code. The code generator
was originally developed in the diploma thesis of F. Schneider [Sch98].

A sample screenshot of the generated distribution tool (DiTo) is shown in Fig. 5. The 
graph represents the architecture of our running example. 



4 Direct Class Access 



In this section, we present a typical scenario in distributed systems: remote classes
cannot be accessed directly in most middleware technologies, including CORBA and
DCOM. Clients have to use an explicit specification of the interface instead.
Fig. 5. Screendump of the generated PROGRES prototype



Though we only pick this single example, we use a general schema from [Rad00]. We
present the context and motivation of a necessary transformation using a schema
known from design patterns [GHJV95]. Design patterns describe a problem together
with a solution that has proven to be useful with respect to a certain goal. Please note
that we factor out the description of a problem, because there may be multiple
“patterns” that are applicable to this problem.

The schema is outlined in the following: 

Context and Problem: This section outlines a problematic architectural structure. We
distinguish two categories denoting the severity of the problem. (1)
Changes are necessary in order to get a running application, for example due to
technical restrictions of the middleware. (2) It is not necessary to replace existing
code, but a standardized (possibly more general) solution to the given problem
exists. At least in the first case, we provide a means to detect this situation.

Forces: Additional constraints or desired properties, for example high performance
or throughput, are described here.

Solution and Structure: This paragraph starts with a presentation of the general idea
behind a pattern. The structure of the pattern is usually shown in the form of a class
diagram. Additionally, the dynamic behavior may be presented in the form of UML’s
sequence diagrams. With the help of these diagrams, we briefly sketch advantages
and disadvantages of the patterns.

Variants: In some cases, there is no single solution (e.g. a pattern) that fits all
scenarios. Variants of a specific pattern have slightly different properties and thus might
be preferred by a developer.

Getting There: This paragraph is not found in the standard literature dealing with
patterns. If a developer starts writing a program from scratch, it is relatively easy to
conform to the structure proposed by a pattern (though there are a lot of possibilities
for failure). Complying with a pattern is certainly non-trivial if a program already
exists. The different steps during the restructuring towards a pattern are discussed
in detail.

Transformation Summary: We present an overview of the sequence of transformations.



4.1 Context and Problem 

Middleware technologies require an explicit specification of the interface of remotely
accessed classes. This restriction in typical middleware technologies is no coincidence:
the caller of an object should not know the concrete implementation it is dealing with.
Referencing an object via an interface allows for the transparent replacement of
a “normal” implementation by a stub which delegates invocations to another address
space.

The specification of interfaces has to be carried out in a programming-language-neutral
form usually called IDL (interface definition language). Different middleware
technologies employ different IDL variants; in the sequel IDL refers to CORBA-IDL. All public
methods have to be expressed in IDL. If the class has superclasses or a method
declaration contains other types as parameters, these have to be specified in IDL as well.

First, we have to detect whether the “problem” of a direct access to a remote class
exists. The graph query² shown in Fig. 6 finds all places in which a class attached to a
certain partition invokes a method of another class (denoted by the calisp path) which
is not attached to that partition (denoted by the “crossed out” path attached).

The simple analysis in Fig. 6 works quite well if all classes are associated with at most
one partition. Let us now look at an example in which this rule fails. If the caller (‘1)
is attached to partition A, and the callee (‘2) to the partitions A and B, the rule cannot
match, because the negative path condition is always violated. However, there is
a potentially remote call from partition A to partition B. It is tempting to forbid this
scenario, because it means that a remote instance is invoked although the class is also
locally available. This is quite restrictive, and a better analysis can avoid this problem.
The failure scenario is critical, as the analysis rule (in the form of Fig. 6) misses a place
requiring changes.
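
The failure scenario can be reproduced with a minimal Java sketch that models only the partition attachments of caller and callee as sets of partition names. The class and method names are assumptions for illustration. The first predicate mirrors the rule as described for Fig. 6. The second is merely one conceivable conservative variant, assumed here for contrast; it is not the analysis developed in the following paragraphs.

```java
import java.util.Set;

final class RemoteCallCheck {
    // Rule in the spirit of Fig. 6: the caller is attached to some partition
    // to which the callee is not attached.
    static boolean naiveRemoteCall(Set<String> callerParts, Set<String> calleeParts) {
        for (String p : callerParts) {
            if (!calleeParts.contains(p)) {
                return true;
            }
        }
        return false;
    }

    // Conservative variant (assumption): the call may be remote as soon as the
    // callee is attached to a partition that differs from one of the caller's.
    static boolean potentiallyRemoteCall(Set<String> callerParts, Set<String> calleeParts) {
        for (String p : callerParts) {
            for (String q : calleeParts) {
                if (!p.equals(q)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

With the caller attached to {A} and the callee attached to {A, B}, the first predicate yields false and the second yields true, which corresponds to the missed call from partition A to partition B.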

² We use the technical concept of a path expression: a computed, virtual edge between two
nodes. Such a path expression may be used in other queries.
path remote_call : CLASS -> CLASS [0:n] =
  ‘1 => ‘2 in

    [graphical part of the path expression not reproduced]

end;

Fig. 6. PROGRES Path Expression: Find a Remote Invocation Targeted to a Concrete Class



Let us take up the considerations from a different point of view: the potential
propagation of references to object instances. In our approach, the primary sources of references
to objects in other partitions are singletons, because all objects that are dynamically
allocated via the language primitive new are local objects. Singletons are classes of which
exactly one instance exists. Our approach features a special treatment of singletons
that allows the remote access of their instance via a so-called singleton factory (which
transparently obtains the reference via a naming service). Once an object obtains such
a reference, invocations of its methods are the source of further (potentially) remote
references.




Fig. 7. PROGRES Path: Find remote callers (‘1) of singletons



The path expression in Fig. 7 reflects this rule: it starts with a singleton factory (‘1) and
tries to find a caller (‘2) in another partition (‘4; the rule matches only distinct partitions
for ‘3 and ‘4).
This path yields only the primary remote references. We have to recursively extend the
set of classes that are remotely referenced. This is captured in the derived attribute
remote (not shown). The value of a derived attribute is computed at runtime using a
specified evaluation rule.

If we have found places in which a direct class access has to be avoided, there is still
the problem that the middleware requires a language-neutral definition of this interface,
i.e. CORBA-IDL. Note that it is not always possible to map an existing interface to
IDL due to some restrictions of IDL. The main restriction is the ban on identical method
names which can only be distinguished by the number and types of their parameters
(so-called overloading, which is allowed in C++ and Java). Fig. 8 shows a graph test for
this situation. The path expression getMethod (not shown) is based on the relationship
has_method but also includes inherited methods.
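
The overloading restriction can be made concrete with a small Java sketch of such a check outside the graph setting. It is only an illustration under assumptions: methods are given as signature strings of the hypothetical form name(paramTypes), and the list is assumed to already contain inherited methods, as getMethod would deliver them.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class OverloadCheck {
    // Returns the method names that occur with more than one distinct signature.
    // Each signature must contain an opening parenthesis, e.g. "find(int)".
    static Set<String> overloadedNames(List<String> signatures) {
        Set<String> seenSignatures = new HashSet<>();
        Set<String> seenNames = new HashSet<>();
        Set<String> overloaded = new TreeSet<>();
        for (String sig : signatures) {
            if (!seenSignatures.add(sig)) {
                continue; // identical signature seen before, e.g. inherited twice
            }
            String name = sig.substring(0, sig.indexOf('('));
            if (!seenNames.add(name)) {
                overloaded.add(name); // same name, different parameter list
            }
        }
        return overloaded;
    }
}
```

An interface for which this set is non-empty could not be mapped to IDL without first renaming the affected methods.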




Fig. 8. PROGRES Test: Find a Remote Interface employing Method Overloading



To summarize, there are two different problems: (1) We have to find places in an
application that are potentially affected by distribution, i.e. remote invocations. We will
have to create an additional interface and let some callers route their invocations through
this interface. (2) We have to check whether this (remote) interface conforms to the
restrictions of the middleware. If these are fulfilled, the middleware-specific interface
(i.e. CORBA-IDL) can be generated.



4.2 Forces 

- Try to minimize the number of methods in a remote interface. This avoids unnecessary
dependencies between two collaborators. It is an application of the principle of
information hiding.

- Try to minimize the number of classes that have to be specified in IDL. As we
have seen in Fig. 8, there are additional constraints on an IDL interface. Coping with
these constraints requires changes to the interface and to the classes that implement
this interface.



4.3 Solution and Structure - Explicit Interface 

Create an interface containing a subset of the public methods of a class c. Objects in
other partitions have to access the class c via the created interface. Strategies for finding
a suitable subset are discussed in section 4.4. In order to meet the goal of small
interface definitions, it can be useful to employ two variants sketched briefly here:

1. Use multiple interface definitions comprising relatively small subsets of the meth- 
ods representing specific views of a class. This has the advantage that the clients 
might only need one of these interfaces. Thus, they would have less code generated 
by IDL compilers. 

2. Use a wrapper that delegates to the original implementation. The wrapper can per- 
form additional services, for example access control. 

4.4 Getting There 

Before an interface is created, we have to find a strategy to select a subset of the
methods of the class. The following strategies are supported by our system.

- Select all public methods. 

- Select all (public) methods that are used by other classes. 

- Select all (public) methods that are used by classes living in another partition, i.e. 
“remote” classes. 

For the last strategy, the graph test in Fig. 9 is used (the two other cases are simpler
subsets of this rule). The test relies on the path expression remote_call, which we have
already seen in Fig. 6. Given a specific interface, it finds all public methods whose name
is used by a remote caller. The set of potential remote callers is selected iteratively via the *
operator in PROGRES. In order to enhance the readability of the specification, we use
the convention to depict nodes denoting a relationship by squares covering only the
node number. This is the case for node ‘4.
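
The three selection strategies can be summarized in a short Java sketch. It works on a flat, hypothetical method record instead of the architecture graph; the names MethodInfo, InterfaceSelection, and Strategy are assumptions for illustration, and the set of calling classes is assumed to exclude the owning class itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical method record: name, visibility, and the classes that call it.
final class MethodInfo {
    final String name;
    final boolean isPublic;
    final Set<String> callers; // calling classes other than the owner

    MethodInfo(String name, boolean isPublic, Set<String> callers) {
        this.name = name;
        this.isPublic = isPublic;
        this.callers = callers;
    }
}

final class InterfaceSelection {
    enum Strategy { ALL_PUBLIC, USED_BY_OTHERS, USED_REMOTELY }

    // Returns the names of the public methods selected by the given strategy.
    // remoteClasses holds the classes living in another partition.
    static List<String> select(List<MethodInfo> methods, Strategy strategy,
                               Set<String> remoteClasses) {
        List<String> selected = new ArrayList<>();
        for (MethodInfo m : methods) {
            if (!m.isPublic) {
                continue; // only public methods are candidates
            }
            boolean keep;
            switch (strategy) {
                case ALL_PUBLIC:
                    keep = true;
                    break;
                case USED_BY_OTHERS:
                    keep = !m.callers.isEmpty();
                    break;
                default: // USED_REMOTELY
                    keep = !Collections.disjoint(m.callers, remoteClasses);
                    break;
            }
            if (keep) {
                selected.add(m.name);
            }
        }
        return selected;
    }
}
```

Each strategy selects a subset of the previous one, which is consistent with the goal of keeping the remote interface small.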

In the next step, a suitable interface containing the current selection has to be created.
This can be done via a straightforward graph transformation (not shown). Besides the
mere method nodes, the complete signature is copied.

Once we have an interface, all applied occurrences of the class in question have to be
replaced by this interface. Fig. 10 shows a graph production that performs this task. The
left-hand side of the rule finds the set of methods and attributes (‘4) whose return type
points to the given concrete class (‘1). The right-hand side redirects the target of the
retType edge to the interface (‘2). The rule also changes all parameter types (‘5) and call
relationships (‘3).
Fig. 9. PROGRES Query: Find the Sets of Methods being Used in Remote Invocations



4.5 Transformation Summary 

Step (1) in Fig. 11 shows that we create an interface containing all public methods of a
certain class. The class inherits from the interface (in Java terminology: it implements
the interface).

If only a small subset of the methods is actually called remotely, there are two
potential solutions.

- As shown in step (2), the remote classes use an interface containing only a subset of
all public methods. Methods with package visibility are moved to a second
interface that is private to the package.

- Employ a wrapper class (step (3)) that comes with a restricted interface. Its methods
delegate invocations to the implementing class. This wrapper might be able to
convert data types or print out debugging information.

Please note that the permanent use of wrappers, which might be considered too costly 
in some scenarios, can be avoided. The wrappers can be generated completely by a 
transient transformation if they are actually needed. 
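
The wrapper alternative can be illustrated with a minimal Python sketch. The classes Account and AccountWrapper and all of their methods are hypothetical and are not taken from the tool described here; the sketch only shows a restricted interface whose methods delegate to the implementing class, convert a data type, and optionally print debugging information.

```python
class Account:
    """Implementing class with more methods than remote callers need (hypothetical)."""

    def __init__(self, balance):
        self._balance = balance

    def balance(self):
        return self._balance

    def deposit(self, amount):
        self._balance += amount

    def audit(self):
        # Not meant to be reachable by remote callers.
        return "internal audit data"


class AccountWrapper:
    """Wrapper exposing only the subset of methods used by remote callers."""

    def __init__(self, target, debug=False):
        self._target = target
        self._debug = debug

    def balance(self):
        if self._debug:
            print("balance() called")
        return self._target.balance()  # delegate to the implementing class

    def deposit(self, amount):
        if self._debug:
            print(f"deposit({amount}) called")
        self._target.deposit(int(amount))  # simple data type conversion


wrapper = AccountWrapper(Account(10))
wrapper.deposit("5")
```

Because the wrapper holds no state of its own, it could be generated on demand and discarded again, which corresponds to the transient use mentioned above.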



5 Related Work 

There is only a small number of approaches that capture the essentials of a design 
pattern in a way suitable for tools. A major difference to our approach is that we do not 
concentrate on a specific design pattern. 

[Figure: PROGRES production redirectInvocations(class : CLASS), with node sets ‘4 : FEATURE and ‘5 : Parameter]

Fig. 10. PROGRES Rule: Replace Applied Occurrences of a Class by its Interface



Zündorf and Jahnke [JZ97] from the University of Paderborn apply graph rewriting
rules to transform architectures. They also incorporate other technologies into their tool:
they use a fuzzy reasoning net to detect “bad code” in existing programs. At the time
[JZ97] was written there was no tool support, but it is planned to integrate the ideas into
the FUCABA environment (From UML to C++ And Back Again, superseded by the
Java variant FUJABA).

Another pattern-oriented transformation tool has been developed by Florijn et al.
[FMW97] at the University of Utrecht. Their system allows existing classes to be bound to a
role in a pattern and new classes to be created by instantiating a pattern. It operates in three
ways: (1) instantiate a pattern, i.e. generate program skeletons, (2) bind classes to roles
in a pattern, and (3) check conformance to a pattern.

The tool enables the analysis of Smalltalk programs and is itself written in Smalltalk. It
uses a fragment model to capture patterns. In the case of the abstract factory, there is a
fragment for the factory and for each product it builds. The fragments are represented as
programming language entities (Smalltalk classes). The implementation of a fragment
controls the behavior upon transformation: for example, a method of the factory fragment
allows for adding a product. It takes care of adding the necessary create<productname>
method to the factory.

Fig. 11. Sequence of Transformations using Interfaces or Wrappers



The following approaches are graph-grammar based. Dean and Cordy [DC95] use an
architecture-like language which identifies, for example, the entities task and file and the
relationships memory-access and invocation. Thus they can represent system structures
as graphs. The graph productions are used to characterize systems according to their
conformance with a given architectural style, e.g. a layered architecture or a pipe-filter
file style. They do not intend to transform these systems. In 1993, the authors had
developed an incomplete (no visual representation) implementation of their technique using
Prolog.

Metayer [Met96] also uses graph grammars to define architectural styles, but explicitly
describes the evolution of an application by graph rewrite rules. Metayer uses
the example of a client/server style and the addition of a new client. He provides an
algorithm that checks whether a given transformation breaks constraints of the
architectural style. Metayer uses a simple, non-object-oriented model of an application as
a set of agents and their interconnections. Apart from this difference to our approach,
the formalism based on multisets is less intuitive than a schema definition in
PROGRES. There is no tool supporting Metayer's approach.

In contrast to Florijn et al., the graph grammar approaches allow for the specification
of patterns at a higher level of abstraction. A main difference of our approach compared
to Zündorf and Jahnke is the focus on the detection of “problematic” places inside
an architecture. The definition of the term “problematic” depends on an overall goal, in 
our case the transition towards a distribution-aware architecture. 

Table 1 compares the different approaches of capturing design pattern knowledge. 

                    class and interaction    class and interaction      none (manual editing)
                    diagrams                 diagrams

Jahnke / Zündorf    detection of a “poor     right-hand side of         graph production
                    pattern” via fuzzy       production
                    reasoning net

Florijn             informal description     user binds classes to      inside fragment
                                             roles in a pattern         implementation

Our approach        graph query detects      right-hand side of         graph production
                    problems                 production, graph query

Table 1: Support of Design Patterns



6 Summary 



We have shown that the application of design patterns can be tackled by means of graph 
techniques, allowing analysis and transformation of architectures. We have applied this 
technique to the problem of preparing an object-based (or object-oriented) application 
for distribution via a standardized middleware. 

The left-hand side of a transformation rule could also be interpreted as an “Anti Pattern” 
as introduced by Brown et al. [BMIM98]. An anti pattern is a solution to a problem that 
should not be used in a certain context. It is interesting that the graph transformation 
rule combines both anti pattern and pattern (on left and right-hand side, respectively). 

In this paper, we focused on operations on the architecture level. These operations are 
implemented by means of graph techniques. As sketched in section 3, the developed 
tool has to employ compiler techniques as well: it has to parse and transform the appli- 
cation’s source code. 

The graphical specifications have the advantage that they provide an intuitive, yet well 
defined means to document and execute architectural transformations. Graph queries 
enable us to examine complex properties of classes and their relationships. In particular,
they enable the analysis of distribution prerequisites (e.g. a missing factory). Graph
productions can be used to transform the architecture of an application in a suitable 
way. They formally capture -at least- a part of the knowledge of a design pattern. 


Abstract. This paper presents a formal approach for managing
unanticipated software evolution. Labelled typed nested graphs are used 
to represent arbitrarily complex software artifacts, and conditional graph 
rewriting is used for managing evolution of these artifacts. More 
specifically, we detect structural and behavioural inconsistencies when 
merging parallel evolutions of the same software artifact. The approach 
is domain-independent, in the sense that it can be customised to many 
different domains, such as software architectures, UML analysis and 
design models, and software code. 



1 Introduction 

When looking at current-day CASE tools and software development environments, we 
observe that most of them provide poor support for evolution or no support at all. If 
evolution support is provided, it is usually ad hoc and restricted to a single phase in 
the software life-cycle. 

Even version control tools [4] do not adequately deal with evolution. When 
merging parallel evolutions of the same software artifact, the best they can do is detect 
structural inconsistencies in the result of the merge [27]. A number of research 
prototypes exist that can also detect behavioural inconsistencies [2, 3], but these 
approaches restrict themselves to a specific language. 

We believe that it is essential to have a tool that allows us to detect behavioural 
merge conflicts in a uniform way. It should be customisable to software artifacts in 
different phases and different domains without needing to modify the underlying 
formalism or algorithms. In this paper we present such an approach, based on the 
technique of reuse contracts [18,24]. Because this technique for dealing with 
unanticipated evolution has already been customised to a number of different domains, 
including class collaborations, UML class diagrams and software architectures, it 
seems to be a suitable candidate to express our ideas. 

By representing arbitrary software artifacts by means of graphs, and evolution of 
these software artifacts by means of conditional graph rewriting, it becomes possible 
to express the ideas behind reuse contracts directly using the formal properties of the 
graph rewriting formalism. 

2 Documenting Evolution

To be able to manage unanticipated evolution of software artifacts, the evolution step 
needs to be documented explicitly in a disciplined (i.e., formal) way. To express the 
specific way in which a software artifact is modified, different types of modification 
can be identified by specifying a modification type that imposes extra restrictions on 
the evolution step. Modification types are fundamental to disciplined evolution, as 
they give us relevant information for identifying merge conflicts when merging 
parallel evolutions of the same software artifact. 




Fig. 1. Documenting parallel evolutions of the same software artifact allows us to detect
behavioural incompatibilities.

To explain more clearly how merge conflicts can be detected, consider the example 
of Fig. 1, where a very simple UML class diagram is being modified by different 
persons during collaborative software development. Two parallel modifications are 
made to a Point class containing two attributes x and y, and one operation distanceTo 
which calculates the Euclidean distance between two points (the receiver and the 
argument). One software developer modifies this basic behaviour by reimplementing 
distanceTo so that it calculates the Manhattan distance instead. Independently, and 
completely unaware of these changes, a second software developer extends the 
functionality of the Point class by introducing two new operations getRadius and 
getAngle for working in polar coordinates. Since getRadius corresponds to the 
Euclidean distance to the origin, it can be calculated by performing a self send to 
distanceTo. 

When merging these parallel modifications, a behavioural conflict arises, because 
the getRadius operation does not behave as it should anymore. Indeed, because 
getRadius invokes distanceTo, which now calculates the Manhattan distance, a 
different result is obtained than before (although the coordinates of the point have not
changed). Such an unanticipated behavioural incompatibility is called a merge conflict
(more specifically, an inconsistent target conflict). It cannot be detected by 
straightforward merging approaches, because they do not take behavioural information 
into account. 
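
The Point example can be replayed in a few lines of Python. This is only an illustration under stated assumptions: the two parallel modifications are modelled as subclasses, the merge as multiple inheritance, and the operation names are written in Python style (distance_to, get_radius); none of this is part of the formalism itself.

```python
import math


class Point:
    """Original version: distance_to computes the Euclidean distance."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def distance_to(self, other):
        return math.hypot(self.x - other.x, self.y - other.y)


class PointA(Point):
    """Developer 1: reimplements distance_to as the Manhattan distance."""

    def distance_to(self, other):
        return abs(self.x - other.x) + abs(self.y - other.y)


class PointB(Point):
    """Developer 2: adds get_radius via a self send to distance_to."""

    def get_radius(self):
        return self.distance_to(Point(0, 0))


class Merged(PointA, PointB):
    """Naive merge: both changes combine without any structural problem."""


print(PointB(3, 4).get_radius())  # 5.0
print(Merged(3, 4).get_radius())  # 7
```

The merged class is structurally well formed, yet get_radius silently changes its result, which is exactly the kind of incompatibility a purely structural merge cannot see.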

While the example above is very simple, in practice the evolving software artifacts 
will be more complex, and many different changes will be made simultaneously. 
Additionally, the merge conflict mentioned above is only one of the many different 
kinds that can arise. This makes it infeasible to detect merge conflicts manually.
Therefore, semi-automated tool support for detecting such behavioural 
incompatibilities is essential. 

One should note that the detection of behavioural merge conflicts in general is an 
undecidable problem. Therefore, the best we can do is take a conservative approach 
that gives us a safe approximation of all potential behavioural conflicts. In other 
words, we can only generate conflict warnings rather than actual conflicts. 

The innovative idea of reuse contracts is that potential behavioural conflicts can be 
detected in a very straightforward way, by explicitly documenting each modification. 
For example, the horizontal evolution step in Fig. 1 is expressed by a graph derivation
$G_0 \Rightarrow G_1$ which does not only specify the original version and the evolved version, but
also formally documents the way in which $G_1$ is obtained from $G_0$. This
documentation is given by a graph production ChangeOperation(Point.distanceTo). It
specifies that the distanceTo operation is modified in some way. The vertical
evolution step is described by a derivation sequence $G_0 \Rightarrow^{*} G_2$ which is composed of
three graph productions. Two AddOperations specify the addition of getRadius and
getAngle, respectively. AddInvocation(Point.getRadius, Point.distanceTo) specifies
that the implementation of getRadius performs a self send to distanceTo. It is the
combination of the vertical AddInvocation and the horizontal ChangeOperation that
gives rise to the merge conflict. AddInvocation shows that an extra call to distanceTo
is added, while independently the behaviour of distanceTo is modified in an
incompatible way.

In the remainder of this paper we will illustrate that graph rewriting techniques 
provide a very suitable domain-independent mechanism for expressing evolution and 
dealing with merge conflicts. 
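
To make the role of the documented modifications concrete, the following Python sketch records the two evolution steps of Fig. 1 as lists of modification objects and compares them. The class layout and the function inconsistent_target_warnings are assumptions made for illustration; only the names ChangeOperation, AddOperation and AddInvocation come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeOperation:
    operation: str


@dataclass(frozen=True)
class AddOperation:
    operation: str


@dataclass(frozen=True)
class AddInvocation:
    caller: str
    callee: str


def inconsistent_target_warnings(evolution_a, evolution_b):
    """Warn when one evolution changes an operation that the other newly invokes."""
    warnings = []
    for first, second in ((evolution_a, evolution_b), (evolution_b, evolution_a)):
        changed = {m.operation for m in first if isinstance(m, ChangeOperation)}
        for m in second:
            if isinstance(m, AddInvocation) and m.callee in changed:
                warnings.append((m.caller, m.callee))
    return warnings


horizontal = [ChangeOperation("Point.distanceTo")]
vertical = [
    AddOperation("Point.getRadius"),
    AddOperation("Point.getAngle"),
    AddInvocation("Point.getRadius", "Point.distanceTo"),
]
print(inconsistent_target_warnings(horizontal, vertical))
```

The check never inspects method bodies. It relies only on the documentation of each step, which is why it yields warnings rather than proven conflicts.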



3 Formal Foundation 



3.1 Graphs 

Because we want the reuse contract technique to be applicable to many different 
domains, we need to choose a formalism that is general enough, yet still intuitive to 
work with. 

We have chosen to represent software artifacts by labelled nested typed graphs for 
various reasons. Graphs are an intuitive, visually attractive, general and 
mathematically well-understood formalism. The edges in a graph are used to represent 
all kinds of software dependencies, such as data-flow and control-flow dependencies. 
Types are introduced as a classification mechanism to distinguish different types of
nodes and edges with similar characteristics. Nesting is used as a means to reduce the
complexity of a graph, by allowing nodes to contain graphs themselves (cf. [9], [21]).
The labelled graphs we use are similar to those in [7], except that our graph labels also 
contain a set of constraints. Moreover, we require some extra injectivity conditions on 
the node and edge labels. See [19] for more detailed information. 





Fig. 2. Example of a UML class diagram and related collaboration diagram. 

Fig. 3. Graph representations corresponding to the UML diagrams.



As an illustration of the graphs we use, consider the UML class diagram and
collaboration diagram of Fig. 2, which represent part of some graphical
object-oriented framework. Their underlying graph representations are depicted in Fig. 3.
Names of classes, attributes, operations and objects in the UML diagrams correspond
to node labels in the graph. Each of these nodes has a corresponding type, which is
either «class», «operation», «attribute» or «object». Associations, aggregations and
specialisations between classes correspond to edges with types «uses», «hasa» and
«isa», respectively. «isa»-edges have no labels. Message sends in the collaboration
diagram correspond to «invokes»-edges or «accesses»-edges in the underlying graph
representation, depending on whether they refer to an «operation»-node or an
«attribute»-node. «operation»-nodes and «attribute»-nodes are always nested inside
a «class»-node or «object»-node. Finally, the «hasa»-edge labelled vertices contains
an extra constraint, denoted between curly braces {}, to express a multiplicity
requirement in the corresponding class diagram (namely that each triangle contains 3
points as vertices).

The node types and edge types used in Fig. 3 also have to satisfy some constraints.
We already mentioned the constraint that an «operation»-node or «attribute»-node
must always be nested in a «class»-node or «object»-node. Other obvious constraints
are that «hasa»-edges can only be placed between «class»-nodes, and similarly for
«isa»-edges and «uses»-edges. All these constraints can be expressed formally in a
so-called type graph [5,6,9]. From an intuitive point of view, the type graph is a
metagraph which puts extra restrictions on the kind of graphs that are allowed. Type 
graphs are very important to customise our formalism to different domains. For each 
specific domain, a type graph must be defined that expresses the well-formedness 
rules that must hold for all the domain-specific software artifacts. 
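
A type graph can be approximated in code as a set of allowed (source type, edge type, target type) triples. The sketch below is a simplification under stated assumptions: the contents of CLASS_TYPES are guessed from the constraints listed in the text rather than copied from Fig. 4, nesting is modelled as an ordinary edge, and the helper ill_typed_edges is hypothetical.

```python
# Allowed edges as (source type, edge type, target type): a simplified stand-in
# for the type graph ClassTypes (the exact contents of Fig. 4 are assumed).
CLASS_TYPES = {
    ("class", "uses", "class"),
    ("class", "hasa", "class"),
    ("class", "isa", "class"),
    ("operation", "nested", "class"),
    ("attribute", "nested", "class"),
}


def ill_typed_edges(node_types, edges, type_graph):
    """Return the edges whose (source, edge, target) types the type graph forbids."""
    return [
        (src, etype, tgt)
        for (src, etype, tgt) in edges
        if (node_types[src], etype, node_types[tgt]) not in type_graph
    ]


node_types = {"Circle": "class", "Point": "class", "area": "operation", "x": "attribute"}
edges = [
    ("Circle", "uses", "Point"),
    ("area", "nested", "Circle"),
    ("x", "nested", "Point"),
    ("area", "hasa", "Point"),  # ill-formed: «hasa» must connect two classes
]
print(ill_typed_edges(node_types, edges, CLASS_TYPES))
```

Customising the check to another domain then amounts to supplying a different set of triples, which mirrors the role the type graph plays in the formalism.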




Fig. 4. Type graphs ClassTypes and ObjectTypes expressing well-formedness constraints for
UML class diagrams and collaboration diagrams.



In Fig. 4, the type graphs ClassTypes and ObjectTypes for the graphs H and K of 
Fig. 3 are shown. Labels in the type graph correspond to types in the graph. A notable 
exception is nested, which is an edge label in the type graphs, although there is no 
explicit «nested»-type. As opposed to UML class diagrams, in collaboration diagrams
we allow «accesses»-edges to be put from an «operation»-node to an «attribute»-node,
and «invokes»-edges between two «operation»-nodes (to specify operation
invocations, self sends and super sends). In this way, the structural information 
expressed in a class diagram becomes supplemented with additional behavioural 
information. 

It should be noted that some constraints cannot be expressed easily using a type 
graph. For example, the restriction that an inheritance (isa) hierarchy should be
acyclic is difficult to express. As a pragmatic solution, this restriction can be attached 
as an extra well-formedness constraint to the isa-edge in the type graph. 

For a complete formal definition of the graphs and type graphs that we use, we 
refer to [19]. All definitions can be given in a category-theoretical way. For example,
Graph defines a category with unlabelled graphs as objects and node-preserving and 
edge-preserving morphisms, LGraph defines a category with labelled graphs as 
objects, and partial graph morphisms as morphisms. If, additionally, the morphisms 
are label-preserving, we get a subcategory LGraphL. Typed graphs also form a 
category LTGraph(T) with partial morphisms, which is parameterised with the type 
graph T. For each specific domain, another type graph T is used, as shown in Fig. 4. 
Each T-typed graph G has a corresponding LGraph-morphism $type: G \to T$ assigning a
type to each node and edge of G. LTGraph(T) also contains subcategories for
label-preserving morphisms, type-preserving morphisms, and both.



3.2 Graph Rewriting 



Since graphs are used to represent software artifacts, graph rewriting is a natural 
choice to represent evolution of these artifacts. The research area of graph rewriting 
has a large mathematical backing [8,10,11,12], while it remains fairly intuitive in use.
We use the algebraic single-pushout approach towards conditional graph 
rewriting [13,14,15], where application conditions are used to determine when a 
certain production is applicable to a given graph. This is essential to detect merge 
conflicts between incompatible evolutions of the same artifact. 

From now on we will use the word graph instead of labelled typed nested graph, 
because all the definitions presented in [17] are proven in the general category of 
graph structures, which encompasses ordinary graphs, hypergraphs, typed graphs, etc. 

Basically, a graph rewriting is defined in terms of a graph production $p: L \to R$
which transforms a left-hand side L into a right-hand side R by means of some
transformation p. The actual graph rewriting is obtained by applying the
transformation p in the context of a larger graph G. Therefore, a match $m: L \to G$ is
needed to specify how the left-hand side L is embedded in G. Given p and m, we can
define a graph derivation $G \Rightarrow_{p,m} H$ by applying p in the context of G. By
sequentially applying a number of productions, we obtain a derivation sequence
$G \Rightarrow^{*} K$.

Mathematically, the result graph H of a derivation $G \Rightarrow_{p,m} H$ is obtained by
calculating the pushout of $p: L \to R$ and $m: L \to G$. This definition corresponds to the
single-pushout approach to graph transformations [17]. The pushout of p and m gives
rise to two new morphisms $p^*: G \to H$ and $m^*: R \to H$.
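
Operationally, applying a production at a match can be sketched with plain sets. The code below is a strongly simplified illustration and not the categorical construction: graphs are (nodes, edges) pairs, labels and nesting are ignored, the partial morphism p is taken to be the identity on names shared by L and R, and new nodes are assumed to have fresh names. The function apply_production and the example graph are hypothetical.

```python
def apply_production(graph, left, right, match):
    """Rewrite `graph` with production left -> right at `match` (dict: L node -> G node).

    Graphs are pairs (nodes, edges) with edges as (source, type, target) triples.
    Items shared by `left` and `right` are preserved, the rest of `left` is deleted,
    and the rest of `right` is created. Edges left dangling by a node deletion are
    removed as well, as in the single-pushout approach.
    """
    nodes, edges = set(graph[0]), set(graph[1])
    (l_nodes, l_edges), (r_nodes, r_edges) = left, right

    def image(n):
        # New nodes of the right-hand side keep their own name (assumed fresh).
        return match.get(n, n)

    def image_edge(e):
        return (image(e[0]), e[1], image(e[2]))

    # The match must embed the whole left-hand side into the graph.
    if not all(match[n] in nodes for n in l_nodes):
        raise ValueError("match does not cover all nodes of the left-hand side")
    if not all(image_edge(e) in edges for e in l_edges):
        raise ValueError("match does not cover all edges of the left-hand side")

    edges -= {image_edge(e) for e in l_edges - r_edges}
    deleted = {match[n] for n in l_nodes - r_nodes}
    nodes -= deleted
    edges = {e for e in edges if e[0] not in deleted and e[2] not in deleted}

    nodes |= {image(n) for n in r_nodes - l_nodes}
    edges |= {image_edge(e) for e in r_edges - l_edges}
    return nodes, edges


G = ({"Circle", "Triangle", "Point"},
     {("Circle", "uses", "Point"), ("Triangle", "uses", "Point"),
      ("Triangle", "hasa", "Point")})
L = ({"C", "T", "P"}, {("C", "uses", "P"), ("T", "uses", "P")})
R = ({"C", "T", "P", "Geo"},
     {("C", "isa", "Geo"), ("T", "isa", "Geo"), ("Geo", "uses", "P")})
result = apply_production(G, L, R, {"C": "Circle", "T": "Triangle", "P": "Point"})
```

The «hasa»-edge from Triangle to Point is not mentioned in L and therefore survives unchanged, which reflects the idea that a production only needs to name the part of the graph it acts on.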

Fig. 5 illustrates this approach by means of an example. We start from a graph G
that satisfies the type graph ClassTypes of Fig. 4. G contains «class»-nodes Circle and
Triangle, which both have «operation»-subnodes circumference and area.
Additionally, both «class»-nodes are the source of a «uses»-edge center whose target
is a «class»-node Point. Point contains subnodes distanceTo, x and y. Finally, Circle has
an extra «attribute»-subnode radius, while Triangle is the source of an additional
«hasa»-edge with label vertices. The production $p: L \to R$ factorises the common
behaviour of Circle and Triangle. Instead of letting Circle and Triangle directly
access the Point node, a common parent Geo is introduced through which all
communication takes place. Geo captures the commonalities of Circle and Triangle by
defining the circumference and area nodes. In this way, redundancy is removed, and
the design is made more reusable. Note that L does not need to specify the radius
subnode of Circle, the subnodes of Point or the vertices edge from Triangle to Point,
since these are not required for performing the transformation. The match $m: L \to G$ is
a total label-preserving and type-preserving graph morphism.

Fig. 5. Example of a graph derivation

Although it is not apparent from the above example, graph productions are also 
allowed to change the type of nodes and edges, as long as these retypings preserve the 
constraints imposed by the type graph. For example, if we use the type graph 
ClassTypes we can change «attribute»-nodes to «operation»-nodes, but we are not
allowed to change a «class»-node to an «attribute»-node or «operation»-node since
this would breach some of the edge type constraints. 

In order to formally define merge conflicts, we need the notion of parallel and 
sequential independence. Intuitively, two (parallel) graph derivations $G \Rightarrow_{p_1,m_1} G_1$ and
$G \Rightarrow_{p_2,m_2} G_2$ starting from the same graph G are parallel independent if they can be
applied one after the other. A similar notion of sequential independence says that the
order in which two productions $p_1$ and $p_2$ are applied in a derivation sequence
$G \Rightarrow_{p_1,m_1} G_1 \Rightarrow_{p_2,m_2} G_2$ is irrelevant. Obviously, there is a close relation between
parallel and sequential independence. Two parallel independent derivations can 
always be sequentialised, and lead to a unique result graph (under certain injectivity 
constraints) which is independent of the order in which the productions are applied. 
This property is usually referred to as the local confluence property, and it is 
essential when merging parallel evolutions of the same software artifact. 
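
The intuition behind parallel independence can be tested operationally: apply the two productions in both orders and compare the results. The sketch below is an approximation under stated assumptions, not the formal definition. Productions are modelled as Python functions that return None when they are not applicable, and the applicability rules of add_edge and drop_node are chosen for illustration.

```python
def add_edge(edge):
    """Production adding `edge`; applicable only if both end nodes exist."""
    def apply(graph):
        nodes, edges = graph
        if edge[0] not in nodes or edge[2] not in nodes:
            return None  # no match: production not applicable
        return nodes, edges | {edge}
    return apply


def drop_node(node):
    """Production removing `node`; applicable only if no edge touches it."""
    def apply(graph):
        nodes, edges = graph
        if node not in nodes or any(node in (s, t) for s, _, t in edges):
            return None
        return nodes - {node}, edges
    return apply


def sequentialise(graph, p1, p2):
    """Apply p1 then p2; return None if either step is not applicable."""
    step = p1(graph)
    return None if step is None else p2(step)


def parallel_independent(graph, p1, p2):
    """Both orders must be applicable and yield the same result graph."""
    first, second = sequentialise(graph, p1, p2), sequentialise(graph, p2, p1)
    return first is not None and first == second


g = ({"area", "radius", "circumference"}, set())
p1 = add_edge(("area", "accesses", "radius"))
p2 = add_edge(("area", "invokes", "circumference"))
p3 = drop_node("radius")
print(parallel_independent(g, p1, p2), parallel_independent(g, p1, p3))
```

The pair p1 and p2 commutes, whereas p1 and p3 cannot be applied in either order without one blocking the other.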

Because the above formalism of graph rewriting is not expressive enough for our 
purposes, we also need to attach application conditions to productions [13,14,15].
The above properties and definitions that hold for ordinary graph rewriting are
directly generalisable to conditional graph rewriting. Intuitively, application
conditions impose additional restrictions on a graph derivation $G \Rightarrow_{p,m} H$. In the case
of application preconditions, the production $p: L \to R$ can only be applied in the
context of G if additional constraints, specified by a morphism $c: L \to L'$, are satisfied.
These constraints can be positive, which means that the match $m: L \to G$ must satisfy
the conditions imposed by c. If the constraints are negative, the match m should never
satisfy the conditions imposed by c. This can be stated formally by demanding that
there is no morphism $v: L' \to G$ that makes the diagram on the left of Fig. 6 commute.

As a concrete example, L' on the right of Fig. 6 presents two preconditions that
could be attached to the production $p: L \to R$ of Fig. 5. A negative precondition states
that there should be no node with label Geo present in G. It is depicted by a dashed
struck-through ellipse surrounding the prohibited node. A positive precondition states
that there should be at least one «hasa»-edge from Triangle to Point. Since both
preconditions are indeed satisfied in Fig. 5, the conditional production can be applied. 
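
The two preconditions of this example can be checked with a small Python sketch. It rests on assumptions: each condition is written as a predicate over the graph and the match instead of as a morphism $c: L \to L'$, and the names satisfies_preconditions, has_geo_node and has_hasa_edge are hypothetical.

```python
def satisfies_preconditions(graph, match, positive=(), negative=()):
    """Check application preconditions for a match.

    Each condition is a function (graph, match) -> bool that reports whether the
    match can be extended to the extra context. Positive conditions must hold,
    negative conditions must not.
    """
    return (all(cond(graph, match) for cond in positive)
            and not any(cond(graph, match) for cond in negative))


def has_geo_node(graph, match):
    """Negative precondition of the example: a node labelled Geo is present."""
    nodes, _ = graph
    return "Geo" in nodes


def has_hasa_edge(graph, match):
    """Positive precondition: a «hasa»-edge from Triangle to Point exists."""
    _, edges = graph
    return (match["Triangle"], "hasa", match["Point"]) in edges


graph = ({"Circle", "Triangle", "Point"}, {("Triangle", "hasa", "Point")})
match = {"Triangle": "Triangle", "Point": "Point"}
applicable = satisfies_preconditions(
    graph, match, positive=[has_hasa_edge], negative=[has_geo_node])

# After Geo has been introduced, the negative precondition blocks a second application.
with_geo = (graph[0] | {"Geo"}, graph[1])
blocked = not satisfies_preconditions(
    with_geo, match, positive=[has_hasa_edge], negative=[has_geo_node])
```

The negative precondition therefore also prevents the same refactoring from being applied twice to the same graph.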



Fig. 6. Application preconditions. 

To stress the fact that we work with application conditions, we will talk about 
conditional productions and conditional derivations. A conditional production is 
defined as a pair ($p: L \to R$, ApplCond(p)) where ApplCond(p) specifies the set of
application conditions attached to a production $p: L \to R$. For the sake of the
presentation, we restrict ourselves to application preconditions in this paper.

4 Domain-Independent Formalism for Evolution 

4.1 Modification Types 

Using the formalism of conditional graph rewriting, the modification type, which 
specifies the kind of modification that takes place, is defined as a parameterised 
conditional production. An example has already been shown in Fig. 5. Although in 
this particular example, the production $p: L \to R$ preserves labels and types, this is not
required in general. Moreover, $p: L \to R$ is a partial morphism, since it is not defined
for all nodes and edges of L. Some nodes and edges on the left-hand side (the ones
that are deleted) do not have a counterpart on the right-hand side. More specifically,
the subnodes circumference and area of Circle and Triangle do not have a counterpart
in R, and similarly for the edges (center, Circle, Point) and (center, Triangle, Point).

In order to detect merge conflicts more easily, we restrict ourselves to a limited set 
of possible modifications that can be made to a graph. The following primitive 
modification types are provided: adding a node or edge to a graph (AddNode and 
AddEdge), removing a node or edge from a graph (DropNode and DropEdge), and
changing the type of a node or edge (RetypeNode and RetypeEdge). The exact 
definition of the primitive modification types is given in Fig. 7. Each modification 
type is parameterised with a number of node and edge labels and types. For type 
parameters, Greek letters ($\omega$, $\eta$, $\lambda$ and $\phi$) are used. Only negative application
preconditions are needed. Because of space considerations and to increase the
readability, these application conditions are not shown in a separate graph $L'$. Instead,
they are mentioned in the left-hand side L inside dashed struck-through ellipses. For
more details, we refer to [19].

Fig. 7. Primitive Modification Types



Together, the six primitive modification types can be used to express any possible 
kind of graph modification that does not involve nesting. In Fig. 8, an example is 
given of a primitive modification type $p_1$ = AddEdge(ε, area, radius, «accesses»),
where ε denotes the empty edge label. It is applied in the context of an «object»-node
Circle, using the type graph ObjectTypes of Fig. 4. 




[Figure: production $p_1$ applied to the «object»-node Circle with subnodes area «operation», radius «attribute» and circumference «operation»]

Fig. 8. Example of a primitive modification type AddEdge(ε, area, radius, «accesses»)



4.2 Structural Conflicts 

The above characterisation of primitive modification types helps us to detect merge 
conflicts when merging parallel evolutions $G \Rightarrow_{p_1,m_1} G_1$ and $G \Rightarrow_{p_2,m_2} G_2$ of the same
software artifact to obtain a combined result graph H. An essential distinction can be 
made between structural conflicts and behavioural conflicts. 

When the parallel evolutions cannot be merged because the resulting graph would 
be ill-formed, we say that a structural conflict has occurred. Typical examples of this 
are name conflicts when the label or type of the same node or edge is modified twice, 
or dangling references when a node is removed while independently an edge to this 
node was added. Formally, a structural conflict is defined by a breach of an
application condition, because $p_1$ is not applicable after $p_2$ or vice versa. In other
words, we have a structural conflict if the two productions are not parallel independent.
In [19], a complete characterisation is given of the different kinds of structural
conflicts that can occur when merging two arbitrary primitive modification types. All 
possible conflicting combinations can be summarised in a conflict table in order to 
facilitate conflict detection. 
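
A conflict table of this kind could be organised as a lookup keyed by pairs of modification types. The following Python sketch is an illustrative excerpt only: it covers just the two examples named above, the dataclass fields are assumptions, and the actual table in [19] is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DropNode:
    node: str


@dataclass(frozen=True)
class AddEdge:
    source: str
    target: str
    edge_type: str


@dataclass(frozen=True)
class RetypeNode:
    node: str
    new_type: str


def dangling_reference(add, drop):
    """An edge is added to or from a node that the other evolution removes."""
    return drop.node in (add.source, add.target)


def name_conflict(first, second):
    """The type of the same node is modified twice."""
    return first.node == second.node


# Illustrative excerpt: pair of modification types -> (conflict kind, test).
CONFLICT_TABLE = {
    (AddEdge, DropNode): ("dangling reference", dangling_reference),
    (RetypeNode, RetypeNode): ("name conflict", name_conflict),
}


def structural_conflict(p1, p2):
    """Look up the pair in either order; return the conflict kind or None."""
    for a, b in ((p1, p2), (p2, p1)):
        entry = CONFLICT_TABLE.get((type(a), type(b)))
        if entry and entry[1](a, b):
            return entry[0]
    return None


print(structural_conflict(DropNode("radius"), AddEdge("area", "radius", "accesses")))
```

Keeping the table as data means that further combinations can be added without touching the lookup function.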



4.3 Behavioural Conflict Warnings 

Because structural conflicts can be detected by structure-oriented merge tools such as 
the one presented in [27], we will not discuss them further here. Instead, we will focus
on another kind of conflict that, as far as we know, cannot be detected by existing
merge tools in a domain-independent way. 

When two graph derivations $G \Rightarrow_{p_1,m_1} G_1$ and $G \Rightarrow_{p_2,m_2} G_2$ do not give rise to a
structural conflict, they are parallel independent. This means that they can be
sequentialised to $G \Rightarrow_{p_1,m_1} G_1 \Rightarrow_{p_2,n_2} H$ or $G \Rightarrow_{p_2,m_2} G_2 \Rightarrow_{p_1,n_1} H$ (where $n_1 = p_2^* \circ m_1$
and $n_2 = p_1^* \circ m_2$). Both cases lead to the same unique merged graph H because of the
local confluence property for conditional derivations. Nevertheless, the merged graph 
can still contain some behavioural incompatibilities because of unexpected 
interactions between both productions. If this is the case, we say that a behavioural 
conflict has occurred. 

Because it is inherently undecidable to determine whether the merge of two parallel 
evolution steps is behaviourally correct, we can only take a conservative approach 
towards detecting behavioural conflicts. Therefore, our formalism generates conflict 
warnings rather than detecting actual conflicts. As an example, consider Fig. 9, where
the graph G in the middle is modified in parallel by two graph derivations $G \Rightarrow_{p_1,m_1} G_1$
and $G \Rightarrow_{p_2,m_2} G_2$. The first derivation has already been explained in Fig. 8. It adds an
«accesses»-edge from area to radius, to indicate that the area is calculated from the
radius. The second derivation takes an alternative approach, by deriving area from
circumference (the area of a Circle can be calculated by integrating its
circumference). This is represented by a primitive modification type $p_2$ =
AddEdge(ε, area, circumference, «invokes»). In the result graph H obtained by merging
both derivations, area suddenly accesses radius via two different paths: once directly,
and once by way of circumference. This is clearly not the intention, since both
modifications were introduced for the same purpose, namely providing an 
implementation of area. This particular behavioural merge conflict is called a double 
reachability conflict. To resolve the conflict, we need to decide which of both 
modifications is the most appropriate, and remove the other one. 
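
A double reachability check on the merged graph can be sketched as follows. The code assumes, in line with the description above, that G already contains an edge from circumference to radius. Edge types are dropped for brevity, and the functions reachable and double_reachability are hypothetical helpers, not part of the formalism.

```python
def reachable(edges, start):
    """Nodes reachable from `start` along directed edges (transitive closure)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for src, tgt in edges:
            if src == node and tgt not in seen:
                seen.add(tgt)
                stack.append(tgt)
    return seen


def double_reachability(base_edges, added_1, added_2):
    """Return the nodes reached via both added edges when they share a source."""
    (src1, tgt1), (src2, tgt2) = added_1, added_2
    if src1 != src2 or added_1 == added_2:
        return set()
    merged = set(base_edges) | {added_1, added_2}
    via_1 = {tgt1} | reachable(merged, tgt1)
    via_2 = {tgt2} | reachable(merged, tgt2)
    return via_1 & via_2


base = {("circumference", "radius")}  # assumed to be present in G
print(double_reachability(base, ("area", "radius"), ("area", "circumference")))
```

A non-empty result is only a warning: whether the two paths really express the same intention still has to be decided by the developers.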

Fig. 9. Example of a double reachability merge conflict



Formally, the notion of behavioural conflict can be defined using the
category-theoretical notions of pushout and pullback. The merge of two graph derivations
$G \Rightarrow_{p_1,m_1} G_1$ and $G \Rightarrow_{p_2,m_2} G_2$ is defined by the pushout of the two corresponding
morphisms $G \to G_1$ and $G \to G_2$. A potential behavioural conflict occurs if the
pullback of the matches $m_1: L_1 \to G$ and $m_2: L_2 \to G$ is not empty, i.e., if the two graph
derivations make parallel changes involving the same element. In the example of 
Fig. 9, we see that the pullback contains the node area, which is exactly the node in 
which the double reachability conflict occurred. 
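
For matches given as plain node mappings, the node part of this pullback is the set of pairs of left-hand side nodes that are sent to the same node of G. The sketch below assumes matches are Python dictionaries and ignores edges; the function name pullback_nodes is hypothetical.

```python
def pullback_nodes(match_1, match_2):
    """Pairs (a, b) of left-hand side nodes that both matches map to the same node of G."""
    return {(a, b)
            for a, x in match_1.items()
            for b, y in match_2.items()
            if x == y}


m1 = {"area": "area", "radius": "radius"}                # match of p1 in G
m2 = {"area": "area", "circumference": "circumference"}  # match of p2 in G
print(pullback_nodes(m1, m2))  # {('area', 'area')}
```

An empty result means the two derivations touch disjoint parts of G, so no behavioural conflict warning is raised for this pair.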

The above definition of behavioural conflict is too coarse-grained, in the sense that 
it does not give much feedback on why there is a problem or how the conflict can be 
resolved. Therefore, in [19] a finer-grained characterisation of behavioural conflicts is 
given, by comparing each pair of primitive modification types that gives rise to a 
conflict. For example, a double reachability conflict arises if we have two
AddEdges p1 and p2 that each add a different edge with the same source and target
node.^ Similarly, a cycle introduction conflict arises when we have two AddEdges p1
and p2 that add an edge in the opposite direction between the same two nodes.
Obviously, merging both modifications leads to the unanticipated introduction of
cycles in the graph. When dealing with evolution of source code this conflict (which is
sometimes called unanticipated recursion) is often difficult to detect, especially when
no disciplined approach towards software evolution is taken. Yet another kind of
behavioural conflict is the inconsistent target conflict illustrated in Fig. 1.

^ In the example of Fig. 9, this is only the case if we take the transitive closure of all edges into
account as well, which is a straightforward extension of the formalism.
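
The pairwise comparison of primitive modification types can be sketched as below. This is a minimal illustration, not the characterisation given in [19]: the AddEdge record and the classify and detect functions are hypothetical, only the two conflict kinds discussed above are covered, and the transitive closure mentioned in the footnote is not taken into account.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class AddEdge:
    label: str
    source: str
    target: str
    edge_type: str


def classify(p1: AddEdge, p2: AddEdge) -> Optional[str]:
    """Name the behavioural conflict caused by a pair of AddEdges, if any."""
    same_direction = (p1.source, p1.target) == (p2.source, p2.target)
    opposite_direction = (p1.source, p1.target) == (p2.target, p2.source)
    if same_direction and p1.label != p2.label:
        return "double reachability"
    if opposite_direction:
        return "cycle introduction"
    return None


def detect(mods1, mods2):
    """Compare every pair of modifications from two parallel evolutions."""
    conflicts = []
    for a, b in product(mods1, mods2):
        kind = classify(a, b)
        if kind is not None:
            conflicts.append((a, b, kind))
    return conflicts


left = [AddEdge("e1", "A", "B", "invokes")]
right = [AddEdge("e2", "B", "A", "invokes"), AddEdge("e3", "A", "B", "invokes")]
for a, b, kind in detect(left, right):
    print(kind, a.label, b.label)
```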

An alternative to detecting behavioural conflicts by comparing pairs of primitive
modification types is to search for the occurrence of graph patterns in the
merged result graph [19]. This approach has the advantage that it is more scalable,
because it does not rely on the specific modification types that have been used.
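
As an illustration of the pattern-based alternative, the sketch below looks for one particular pattern, a directed cycle, directly in the merged result graph, without any knowledge of the modifications that produced it. The representation of the graph as a set of (source, target) pairs and the function name are assumptions made for the example.

```python
def has_cycle(edges):
    """Depth-first search for a directed cycle in a graph given as (source, target) pairs."""
    successors = {}
    for s, t in edges:
        successors.setdefault(s, set()).add(t)
        successors.setdefault(t, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in successors}

    def visit(node):
        colour[node] = GREY
        for nxt in successors[node]:
            if colour[nxt] == GREY:
                return True  # back edge: the current path closes a cycle
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in successors)


merged = {("A", "B"), ("B", "C"), ("C", "A")}
print(has_cycle(merged))
```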



4.4 Domain-Specific Customisations 

In practice, many unnecessary behavioural conflict warnings will be generated. To 
reduce the number of unnecessary warnings, one can resort to more sophisticated 
conflict detection techniques that take more semantic information into account [2,3].
We take a similar approach, by fine-tuning the formalism to different domains, and 
making use of domain-specific knowledge to remove some of the unnecessary conflict 
warnings. 

Customisation of the formalism to a specific domain is straightforward thanks to 
the use of type graphs. In Figs. 2, 3 and 4 we illustrated this by representing two kinds
of UML diagrams, namely a class diagram and a collaboration diagram, as well as 
their corresponding type graphs. For each domain, we also need to provide domain- 
specific modifications, and specify how to translate them in terms of the primitive 
modification types. For example, we can define AddClass and AddOperation in terms 
of AddNode, and AddAssociation in terms of AddEdge. This allows us to remove 
certain unnecessary conflict warnings, such as the cycle introduction conflict which 
can be ignored in the case of adding associations. Additionally, domain-specific well- 
formedness constraints allow us to capture some of the detected behavioural conflicts 
as breaches of these constraints, thus turning a behavioural conflict into a structural 
one. This is for example the case for a cyclic introduction of «isa»-edges, which gets 
captured by the well-formedness constraint that the inheritance hierarchy should be 
acyclic. 
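
A possible way to encode such a customisation is sketched below. The translation table follows the examples in the text; the table of ignorable conflicts and the function names are hypothetical, and a warning is represented simply as a triple of two domain-specific modification names and a conflict kind.

```python
# Translation of domain-specific modifications into primitive modification types.
PRIMITIVE_OF = {
    "AddClass": "AddNode",
    "AddOperation": "AddNode",
    "AddAssociation": "AddEdge",
}

# Conflict kinds assumed to be harmless for a given domain-specific modification.
IGNORABLE = {"AddAssociation": {"cycle introduction"}}


def translate(domain_op):
    """Return the primitive modification type behind a domain-specific one."""
    return PRIMITIVE_OF[domain_op]


def relevant_warnings(warnings):
    """Drop warnings that both modifications involved declare ignorable."""
    kept = []
    for op1, op2, kind in warnings:
        harmless = kind in IGNORABLE.get(op1, set()) and kind in IGNORABLE.get(op2, set())
        if not harmless:
            kept.append((op1, op2, kind))
    return kept


warnings = [
    ("AddAssociation", "AddAssociation", "cycle introduction"),
    ("AddAssociation", "AddAssociation", "double reachability"),
]
print(relevant_warnings(warnings))
```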



5 Scalability 

5.1 Nesting 

The six primitive modification types explained in section 4.1 are insufficient to 
describe all possible modifications of a nested graph. Therefore, we introduce three 
new primitive modification types specifically for changing the nesting relationship 
between existing nodes. Promotion can be used to pull a node up one level in the 
nesting hierarchy, and Demotion to push a node one level lower. MoveNode is used to 
move a nested node inside a new parent node at the same level as its current parent 
node. For example, in Fig. 5 we could use MoveNode(Circle, area, Geo) to move the 
area node from Circle to Geo. 
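
The three nesting modifications can be sketched on a minimal nested graph that records only the parent of each node. The class and method names are hypothetical. The argument order of move_node follows the MoveNode example above, while the second argument of demote (the sibling that receives the node) is an assumption, since the text does not give a signature for Demotion.

```python
class NestedGraph:
    """Nodes with a nesting relation; top-level nodes have parent None."""

    def __init__(self):
        self.parent = {}

    def add_node(self, node, parent=None):
        self.parent[node] = parent

    def depth(self, node):
        level = 0
        while self.parent[node] is not None:
            node = self.parent[node]
            level += 1
        return level

    def promote(self, node):
        """Pull a node up one level in the nesting hierarchy."""
        current = self.parent[node]
        if current is None:
            raise ValueError("a top-level node cannot be promoted")
        self.parent[node] = self.parent[current]

    def demote(self, node, into):
        """Push a node one level lower, into one of its siblings."""
        if into == node or self.parent[into] != self.parent[node]:
            raise ValueError("the receiving node must be a sibling")
        self.parent[node] = into

    def move_node(self, old_parent, node, new_parent):
        """Move a nested node to a new parent at the same level as its current parent."""
        if self.parent[node] != old_parent:
            raise ValueError("node is not nested in old_parent")
        if self.depth(old_parent) != self.depth(new_parent):
            raise ValueError("new parent must be at the same level as the old one")
        self.parent[node] = new_parent


g = NestedGraph()
for cls in ("Circle", "Triangle", "Geo"):
    g.add_node(cls)
g.add_node("area", parent="Circle")
g.move_node("Circle", "area", "Geo")
print(g.parent["area"])
```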

Obviously, the new primitive modification types for nesting give rise to new merge 
conflicts, although we will not discuss them here. 
5.2 Composite Modification Types

Because the primitive modification types are too elementary to be practically useful, 
we need to predefine a number of frequently recurring derivation sequences. For 
example, we could define RedirectSource(center, Circle, Point, «uses», Geo) as the
sequence DropEdge(center, Circle, Point, «uses»), AddEdge(center, Geo, Point, «uses»)
in Fig. 5. RedirectSource is called a composite modification type. Similarly, the
entire modification p of Fig. 5 can be expressed as a composite modification type
CreateSuperclass(Circle, Triangle, Geo), which is composed of:

AddNode(Geo, «class»), RedirectSource(center, Circle, Point, «uses», Geo),
MoveNode(Circle, area, Geo), MoveNode(Circle, circumference, Geo),
DropEdge(center, Triangle, Point, «uses»),
DropNode(Triangle, area), DropNode(Triangle, circumference),
AddEdge(e, Circle, Geo, «isa»), AddEdge(e, Triangle, Geo, «isa»).

The advantage of composite modification types is that they allow us to fine-tune the 
conflict detection mechanism. It becomes possible to disregard certain behavioural 
conflict warnings if they occur in certain composite modification types by making use 
of the fact that its primitive constituents always appear in a particular combination 
with each other. 
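
The expansion of a composite modification type into its primitive constituents can be sketched as follows, using RedirectSource as defined above. The encoding of primitive modifications as tuples and of the graph as a set of labelled edges is an assumption made for the example, and only the two edge primitives needed here are interpreted.

```python
def redirect_source(label, old_source, target, edge_type, new_source):
    """Expand RedirectSource into its sequence of primitive modifications."""
    return [
        ("DropEdge", label, old_source, target, edge_type),
        ("AddEdge", label, new_source, target, edge_type),
    ]


def apply_sequence(edges, sequence):
    """Apply AddEdge/DropEdge primitives to a set of (label, source, target, type) edges."""
    result = set(edges)
    for op, *edge in sequence:
        edge = tuple(edge)
        if op == "AddEdge":
            result.add(edge)
        elif op == "DropEdge":
            result.remove(edge)  # raises KeyError if the edge is absent
        else:
            raise ValueError(f"unsupported primitive: {op}")
    return result


before = {("center", "Circle", "Point", "uses")}
after = apply_sequence(before, redirect_source("center", "Circle", "Point", "uses", "Geo"))
print(after)
```

Because the DropEdge and AddEdge always occur together inside RedirectSource, a conflict detector that knows the composite type can treat the pair as a single unit.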



5.3 Normalisation 

Another way to scale up the approach is by introducing a normalisation algorithm. It 
has two important purposes. First, it removes redundancy in an arbitrary evolution 
sequence, such as a node that is added and removed again. Second, it rearranges all 
derivations in the sequence in a canonical form, by putting all modifications of the 
same type together in a certain order. Formally, the algorithm is based on the notion of 
sequential independence of conditional graph derivations. The current implementation
uses an enhanced bubble-sort algorithm.

Normalisation has many advantages. It compacts arbitrary evolution sequences, 
thus reducing space and complexity. Another side-effect is that fewer unnecessary
behavioural conflict warnings will be generated during conflict detection. Finally, the 
canonical form of the resulting derivation sequence is easier to understand, and allows 
us to answer questions like “Is node v removed during this particular evolution 
sequence?” in an efficient and straightforward way. 
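
The two purposes of normalisation can be illustrated by the following sketch. It is not the actual algorithm: modifications are encoded as tuples, sequential independence is approximated by requiring that two modifications mention no common node, only an AddNode that is later undone by a DropNode with no use in between is treated as redundant, and the canonical order of modification types is an arbitrary choice.

```python
# Assumed canonical order of modification types.
ORDER = {"AddNode": 0, "AddEdge": 1, "DropEdge": 2, "DropNode": 3}


def independent(a, b):
    """Approximate sequential independence: the two modifications share no node."""
    return set(a[1:]).isdisjoint(b[1:])


def cancel_redundant(sequence):
    """Remove an AddNode(v) that is later undone by DropNode(v) without any use in between."""
    seq = list(sequence)
    i = 0
    while i < len(seq):
        removed = False
        if seq[i][0] == "AddNode":
            v = seq[i][1]
            for j in range(i + 1, len(seq)):
                if seq[j] == ("DropNode", v):
                    del seq[j]
                    del seq[i]
                    removed = True
                    break
                if v in seq[j][1:]:
                    break  # v is used in between, so both modifications are kept
        if not removed:
            i += 1
    return seq


def normalise(sequence):
    """Bubble-sort style reordering that only swaps independent neighbours."""
    seq = cancel_redundant(sequence)
    changed = True
    while changed:
        changed = False
        for i in range(len(seq) - 1):
            a, b = seq[i], seq[i + 1]
            if ORDER[a[0]] > ORDER[b[0]] and independent(a, b):
                seq[i], seq[i + 1] = b, a
                changed = True
    return seq


evolution = [
    ("AddNode", "tmp"),
    ("AddEdge", "a", "b"),
    ("DropNode", "tmp"),
    ("DropEdge", "c", "d"),
    ("AddNode", "x"),
]
normal_form = normalise(evolution)
print(normal_form)
print(("DropNode", "tmp") in normal_form)  # "Is node tmp removed during this evolution?"
```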



6 Conclusion 

6.1 Summary 

In this paper we explained how the formalism of reuse contracts could be defined on 
top of conditional graph rewriting with labelled typed nested graphs. This made it 
possible to deal with evolution of software artifacts in an intuitive and scalable way. 
More specifically, the approach is useful to detect structural and behavioural conflicts 
when merging parallel evolutions of the same software artifact. 
An essential feature of our domain-independent formalism for software evolution is
that it can be customised relatively easily to particular domains of software 
development. It suffices to specify the domain-specific type graph, express the 
domain-specific modification operations in terms of primitive or composite 
modification types, and determine which behavioural conflicts may be disregarded in 
particular situations. We illustrated our ideas on UML class diagrams and 
collaboration diagrams, but the approach is generally applicable to any other domain 
of software development where evolution is important. 



6.2 Related Work 

Because our approach provides support for detecting conflicts when merging parallel 
modifications of the same software artifact, it can be seen as an extension of existing 
merge techniques. Commercially available merge tools work on a purely textual 
basis [22], not taking into account any syntactic or structural information imposed by 
the programming language. As a result, they only detect physical conflicts, where the 
same line of code is modified in parallel by different software developers. Some 
alternatives take more structural information into account, such as the visual 
differencing tool for UML that comes with Rational Rose, and the domain- 
independent tool proposed in [27] which works with abstract syntax graphs. None of
the existing tools seems to deal with behavioural conflicts, because this requires more
semantic information, which is not considered due to the additional complexity it gives
rise to. One notable exception is [2], where a language-independent formalism is
proposed to merge changes of programs based on the semantics rather than the 
concrete representation. Compared to our approach, the formalism proposed there is 
significantly more complex, and because of its abstractness it is unable to diagnose 
and locate conflicts between changes in the concrete representation of the program. 

In [28], a category-theoretical approach towards software evolution is given. 
Although it doesn’t specifically use graph rewriting, it contains some similarities to 
our work. However, the approach restricts itself to software specifications only, and 
doesn’t discuss the important topic of merge conflicts. 

There are many transformational approaches to describe the evolution of software 
artifacts. For example, [1] proposes a number of operations to transform object- 
oriented database schemas, while [21] proposes a number of behaviour-preserving 
refactoring transformations for object-oriented applications. All these approaches, 
however, are dedicated to a specific domain, and do not deal with merge conflicts. 

Another related area of research that relies on graph rewriting is dynamic evolution 
(or reconfiguration) of software architectures. While graphs are used to formally 
represent architectural components and their interconnections, graph rewriting can be 
used to manage dynamic changes or reconfigurations [16,20,25,26]. An attempt to 
apply our approach to software architectures is undertaken in [23]. 







6.3 Future Work 

Although we have already performed some basic experiments, we still need to validate 
our work in the context of a large industrial case study. Other necessary tasks are the 
integration of our approach in a CASE tool, and using our formalism for creating more 
sophisticated version control tools. 

While the formalism explained in this paper is very promising, it can still be 
augmented in many ways. From a language point of view, one useful extension would 
be to add an encapsulation mechanism to nested graphs, to specify which nodes and 
edges are visible to the outside [9]. A parameterisation mechanism could also be 
introduced, to deal with concepts like template classes or template methods. 

Another way to enhance the expressiveness of graphs would be to use hyperedges.
On the one hand, this would allow edges that have more than one source node and 
target node. On the other hand, it would allow us to nest graphs into edges as well. 

The type graphs we use are restricted to one level only. A slight generalisation 
would allow us to define type graphs of type graphs as well, and so on ad 
infinitum [9]. An interesting practical application of this would be to detect 
inconsistencies in UML diagrams when the UML metamodel itself evolves. To 
achieve this it suffices to apply our formalism on the level of type graphs. 

For each of the generalisations proposed above, it needs to be checked whether 
they still preserve the formal requirements needed for our approach. Basically, this 
means that the definitions of pullback, pushout, parallel and sequential dependence, 
and the confluence property should still be valid. 
Abstract.



Many different reasons have induced researchers to develop languages exploiting
visual representations. Visual elements, two-dimensional parsers and, more generally,
new language grammars (of different kinds) have been suggested and implemented over
the last fifteen years, both formally and experimentally, depending on the background of
the authors. After indicating targets and motivations for research on visual languages,
a few taxonomies will be considered. Two examples of visual languages, together
with the point of view taken by their originators, will also be provided, as well as some
important steps in the progress made over these years. Finally, open questions
and future research directions will be given, together with an indication of the
principal events which act as international windows on long-standing discussions relevant
for the design of new and more effective visual languages.



1 Introduction 

Visual languages have been present ever since men tried to communicate by gestures, 
drawings and symbols. The interest in such languages increased due to the fact that 
computers can now use different data structures (like text, numbers, images, sound) 
on which computations may be performed so that, it has been argued, even picture- 
like sentences (called visual sentences in the technical literature) of novel languages 
can be interpreted and executed. 

Moreover, the last ten years have seen a change in the way people use
computers. On the one hand, the computer has become a communication device (it
suffices to say that e-mail is one of the most used application programs on the Internet)
and on the other hand, most commercial programs are used interactively, i.e. as
dialogue sessions between the user and the different stored applications. In this way,
constant (mostly visual) feedback is necessary to ensure that the user is always
controlling his/her application, that he/she may switch from one environment to
another (in a multi-modal way) and, more significantly, that he/she may manipulate
large quantities of information reliably, efficiently and effectively. In my opinion, the
increase in computer usage (both quantitatively, in terms of the number of users, and
qualitatively, considering the different computing tasks) provides the fuel that feeds
research on visual languages.

When trying to develop languages which are based on pictorial symbols, we must 
provide a syntax, semantics and pragmatics since every sentence of any language is 
used with some conventions in mind (pertaining to a group working in a given 
application domain). To complete the different levels which must be fully covered in
order to be able to produce a useful visual language, interaction and reasoning must
also be supported by such a language, as will be seen in the sequel.

A discussion list on the Internet, moderated by David McIntyre [1], which has now
been discontinued, debated problems related to the motivations, definitions and
paradigms in connection with visual languages.



2 Defining visual languages 

A broad definition, given many years ago, claims that a visual language manipulates
visual information, supports visual interaction and allows programming with visual
expressions (the latter is taken to be the definition of a visual programming language).
As seen here, a visual language can be a programming language, but not necessarily so. In
fact, American Sign Language (ASL), which is used by deaf persons, belongs to a class
of visual languages that are not used for programming but,
as with any other language, for expression and communication purposes, and we may
also add for interaction between persons and for information exchange. A visual
language may be seen as a set of spatial arrangements of text/graphical symbols, with
a semantic interpretation, that is used to carry out communication actions in the real
world.

Another broad definition considers a visual programming (VP) language as any 
system that allows the user to specify a program in two (or more) dimensions. 
Conventional textual languages are not considered two dimensional since the 
compilers or interpreters process them as long, one-dimensional streams. This 
explains why there are two distinct approaches to the formalization of visual 
languages: some start from one dimension extending it to a second one, whilst others 
consider the two-dimensional expressions (also called visual sentences) as the basic 
elements which must be parsed, interpreted and executed on a computer. We will be 
considering visual programming languages which form the core of the work, within 
computer science, for the visual language community. 

Other authors prefer the term "executable graphics" instead of visual
programming languages; still others consider "visual programming language" as a 
misnomer: it either means a programming language which we can see, which is 
trivial, or a language used for programming the behavior of visual things, which is 
limiting. 
Paul Lyon has coined the term "Hyperprogramming", which better summarizes the
capabilities and support provided by visual programming languages: both theoretical
and practical issues are involved. The theoretical arguments relate to the need for
formal syntax and semantics and for two-dimensional parsers, while the practical issues
include the availability of sufficient computing power to support the capture and
processing of visually expressed diagrams, the representational power of the visual
elements and their spatial organization.

Visual languages were designed with different applications in mind (system
modelling and monitoring, software development control, etc.), thus giving rise to a
variety of graphical notations: diagrams, graphs, charts, etc.

Within the different approaches to use rules for generating visual programs [2] we 
must distinguish between the productions of a grammar to derive all the language 
sentences from a given axiom and a transformation system (having no initial axiom) 
using a set of rules for transforming input facts into a convenient state. This last 
system is also called icon rewriting and has different versions depending on the 
workspace structure, the pattern matching mechanism, the granularity of the rules, the 
available spatial relationships and their coding, etc. 

As a relevant point, we should mention that the formalisms for defining the
language syntax use a textual notation instead of a graphical one, and for this reason the
graph transformation approach appears promising since it provides a visual
representation of the syntax formalism. Both visual programming environments and
two-dimensional parsers may be based on graph transformations, ensuring syntactic
consistency as well as semantics. Systems developed exploiting graph transformations
include a generator of visual language editors (GenGEd), which allows the specification of
pictorial objects and syntax rules, and another system, called DiaGen, which supports
syntax-driven editing allowing complex interactions; it can be used to create a
graphical editor from a formal specification.

A good example of a language based on a transformation system (dating back nearly
twenty years) is PROGRES [3]; its acronym derives from PROgrammed GRaph
REwriting Systems, and it provides tools for the software lifecycle from analysis to
programming, including software reuse. Guidelines for this project were to use
graphical syntax without excluding a textual one (if less ambiguous), to impose a
precisely defined syntax and semantics, to provide declarative language elements for the
graph views, derived attributes and integrity constraints, and to separate data definition from
data manipulation, using class diagrams to provide graph type definitions. Last but not
least, the project aimed to integrate the declarative paradigm with the imperative one of
textual procedural programming languages.

A recent visual language was designed along the same school of thought, i.e. graph
transformations, in this case combined with Java™; it contains both graphical and
textual elements. AGG [2] is the name of this system; it uses a specific
programming paradigm (graph transformation) combined with the object-oriented one,
also aiming at distributed systems. The authors claim that such a system offers the
possibility of software validation since it has a formal basis; it supports graphs
and rules structured into graph grammars and also useful programmer functions like
editing, interpreting and debugging. For a comparison of PROGRES with AGG see
Fig. 3.23 in the above quoted reference [2] on page 166.

The advantage of combining visual concepts with the textual programming
language Java™ is that of exploiting existing libraries and traditional
programming experience. In brief, AGG is a general-purpose graph transformation
based language.

Unfortunately, graph transformation based languages also share, with other visual
languages, the same problems of lack of visibility, few commercial tools, scalability
difficulties and screen real estate organization. Perhaps some human controlled
experiments on usability for these visual languages could reveal, if properly
conducted, which are the main gaps between the visual aspects - as shown on the
screen - and human understanding as deduced from user mistakes, misinterpretation, etc.

As we can see, no single definition of VL is able to encompass all the properties a 
visual programming language should have. Nevertheless, some interesting visual 
computation models have been proposed which fully describe, at a reasonable 
abstraction level, what a visual programming language does, how it is perceived by 
the user (cognitive aspects), how the computer interprets (or compiles) and executes 
those visual statements. 



3 Classifying visual languages 

Visual programming languages may be classified according to many different 
features: the type and extent of the visual expressions used (icon-based languages,
form-based languages and diagram languages), the endorsed features, their
implementation issues, the overall purpose of the programming language, the formal 
framework which characterizes such languages and the corresponding software 
engineering issues. 

At the beginning of work done on visual languages (at the first IEEE Workshop on
Visual Languages [4], in 1984 and onwards) there were many discussions as to which
were the main features of such languages, and a first issue arose as to the possibility
that visual programming languages would/should have their own paradigm. Since
some projects dealing with visual structures on a screen had a data-flow paradigm,
some authors claimed that this paradigm was embedded in the visual language concept.
Yet other languages were essentially procedural, and the representation aspect was not
directly connected to the execution style of the program; for these reasons no specific
paradigm can be attached to visual programming languages. Some have suggested
considering a multi-paradigmatic style, since different styles may co-exist, also
depending on the implementation of the interactive facilities (event driven, object
oriented, etc.).

In order to consider how to classify visual programming languages (VPL), a good
starting point is the set of considerations on different possible taxonomies described in [5]
in 1990. In fact, there are many ways to classify visual languages; the author
considers three different dimensions for the programming language space: non
example-based vs. example-based, interpreted vs. compiled, textual vs. visual.

A more articulated and updated taxonomy, devised and supported by M. Burnett
[6], can be seen in Figure 1. Seven different, but related, areas exist in this taxonomy
and they are listed here: 1) Environments and Tools for VPLs, 2) Language
Classifications, 3) Language Features, 4) Language Implementation Issues, 5)
Language Purpose, 6) Theory of VPLs, 7) Software Engineering Issues for VPLs.

VPL-I. Environments and Tools for VPLs
VPL-II. Language Classification
    A. Paradigms
        1. Concurrent languages
        2. Constraint-based languages
        3. Data-flow languages
        4. Form-based and spreadsheet-based languages
        5. Functional languages
        6. Imperative languages
        7. Logic languages
        8. Multi-paradigm languages
        9. Object-oriented languages
        10. Programming-by-demonstration languages
        11. Rule-based languages
    B. Visual Representations
        1. Diagrammatic languages
        2. Iconic languages
        3. Languages based on static pictorial sequences
VPL-III. Language Features
    A. Abstraction
    B. Control flow
    C. Data types and structures
    D. Documentation
    E. Event handling
    F. Exception handling
VPL-IV. Language Implementation Issues
    A. Computational approaches (e.g. demand-driven, data-driven)
    B. Efficiency
    C. Parsing
    D. Translators (interpreters and compilers)
VPL-V. Language Purpose
    A. General-purpose languages
    B. Database languages
    C. Image-processing languages
    D. Scientific visualization languages
    E. User-interface generation languages
    F. Languages for programming web-based applications
VPL-VI. Theory of VPLs
    A. Formal definition of VPLs
    B. Icon theory
    C. VPL design issues
    D. Human-oriented issues
VPL-VII. Software Engineering Issues for VPLs

Fig. 1. A classification of visual programming languages [3] 

Other authors have considered five major features for subdividing visual languages
into different classes: visual alphabet, visual syntax, interaction structure and
coverage. This, in some way, overlaps other previously defined categories but also
introduces an important feature, namely the interaction machinery of the language,
which, as we know, is very important for taking advantage of exploration and
communication in computing. A very good repository for books, articles and research
projects in visual languages is [7], which is supported and constantly updated by
Bertrand Ibrahim of the University of Geneva (Switzerland).

From the point of view of the visual language grammars, Marriott et al. [8] consider
expressiveness and parsing complexity as two important features for any 
programming language. 

Perhaps, the most important characteristic, both from the computational (machine) 
and cognitive (human) points of view, is the way in which information is represented 
through its (static and dynamic) syntax and semantics, and the different data types, 
states and transitions of the system within given application domains. In fact, the 
contents of the visual display, the elements of a visual language, are visual 
expressions of such language (including the background) and must be defined in 
terms of what is represented, how it is represented and mapped to the original object. 



4 From text to pictures 

As we have just seen, the main feature of a visual language is linked to its ability to 
manipulate graphical structures, be they graphs, icons or pictures, thus performing
computations under user control. It is very difficult to have a lexical ordering - or any
kind of ordering - of graphical structures or, as called by some authors,
bi-dimensional visual structures (they could also be three-dimensional). In fact, even
when analyzing text, we may find that a sentence may be "decorated" by using styles
(different fonts, underlining, bold and italic, different sizes, etc.) and, on top of this, 
we may also annotate text (with handwritten notes, arrows, ticks or typo corrections 
for a publisher) on a galley proof: for these reasons some authors do not distinguish 
between graphical representations and text. 

As we may see in Figure 2, we have bold fonts, underlining, enlargement, in the 
first sentence, followed by a typographical symbol indicating new paragraph. Next, a 
shadowed font has been used for three important words but we could add handwritten 
pencil marks, color, etc. An interesting concept, introduced by Norman [9] assigns an 
affordance feature to the represented objects, distinguishing between realistic 
affordance and perceived affordance. The first concerns the relationship between the 
represented object (for instance an icon) and what such object - in the real world - is 
able to do (a wheel turns, a hammer bangs, etc.) while the second affordance, which is 
the most interesting in this context, is what the user is inclined to interpret by just 
looking at the icon (an airplane silhouette pointing to the ground may induce a 
traveller to think of aircraft maintenance, flight arrivals, air personnel available for
passengers, etc.). 

Since we are surrounded by cellular phones (at least in Italy!) it may be interesting 
to note that a pictorial representation of the different available menus, in a Nokia 3110 
cellular phone (refer to Figure 3), may help the user in understanding the functions 
he/she may access and use, particularly so when he/she is within a loop of options and
ignores the exit procedure. In fact, a two dimensional diagram showing all possible
operations is far better than a sequential handbook of 70 pages just to make a phone 
call! 



This house is in a terrible mess but we do not despair ! We hope to 
repair the roof and the door as a first step towards obtaining a comfortable living 
place. 



Fig. 2. Text as a picture 
Fig. 3. The Nokia 3110 cellular phone: a graph description of its features

As a first two-dimensional extension of text, we have forms; a matrix of cells is a
typical example of this class of documents. In 1984, Shu suggested a
forms-based language [10] called FORMAL in the realm of visual languages. Her
main motivations were that 1) a form is a good candidate for an underlying data
model, 2) her language was very powerful for data manipulation within forms and 3)
all the familiar concepts of form filling would ease the understanding and usage
of this new language.

In fact, forms have also been the basis for spreadsheets, which may be seen as a
first case of visual programming since the computations to be performed depend on
the position of the data along the cells of the form. The most typical form of visual
interhuman communication is through diagrams (sometimes hand-made on cocktail
napkins at parties [11]) where boxes are interconnected as if wires were
physically present. Boxes and their interconnections, frequently used in block
diagrams, may be very well represented by graphs, where nodes stand for the boxes
and arcs for the wires. Graphs have been well studied for a long time and many
algorithms on/for them are known; this is why - I believe - there is particular
attention to visual languages from graph experts, particularly in the community of
graph grammars.
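
To make the correspondence between block diagrams and graphs concrete, the following sketch represents boxes as nodes and wires as arcs, and evaluates the diagram with a standard graph algorithm (topological ordering, here taken from the graphlib module of the Python standard library, available from Python 3.9). The boxes and their functions are invented for the example and do not come from any of the languages discussed here.

```python
from graphlib import TopologicalSorter

# Boxes of a small block diagram: each box computes a value from its inputs.
boxes = {
    "source": lambda: 3,
    "double": lambda x: 2 * x,
    "offset": lambda x: x + 1,
    "sum": lambda a, b: a + b,
}

# Wires: for each box, the ordered list of boxes feeding its inputs.
wires = {
    "source": [],
    "double": ["source"],
    "offset": ["source"],
    "sum": ["double", "offset"],
}


def run(boxes, wires):
    """Evaluate every box once all the boxes wired to its inputs have a value."""
    values = {}
    for box in TopologicalSorter(wires).static_order():
        inputs = [values[feeder] for feeder in wires[box]]
        values[box] = boxes[box](*inputs)
    return values


print(run(boxes, wires)["sum"])
```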

The interest in graphs as visual components of a computation system is manifold:
the possibility of using graphs as a reasoning tool, the desire to develop algorithms for 
pretty layout of graphs (without intersections and spaghetti-like windings), the hope 
of developing efficient parsers for graph structures which generally have no a-priori 
ordering and, last but not least, the desire to perform automatic mappings between 
application domain objects and functions into a visual domain made of icons and 
operations. 

As we may see from the chosen icons (shown in Figure 4), they are clearly
representative of data classes which can be further specified by the user, since no
automatic recognition by the system is expected. Icons as shown here do not belong
to the visual alphabet of a given visual language; they help the user in managing his/her
virtual desktop, as pioneered by Apple™ machines. The pin is generally used for
putting notices on a board, so that such a labeled folder may contain timely prompts; the
file cabinet stores data; the book can hold personal articles; the folder may contain
other folders as well as files; and, finally, the clock is a time function and may display
hours, minutes and seconds according to preferences which establish which time we
are interested in.




Fig. 4. Standard icons for representing clerical work 



Finally we arrive at pictures (sometimes called images since they are the digital
version of the analog, natural originals). The chosen image, Figure 5, could depict a
given application domain, significant object/s in that domain or the activation of a
program having actions connected to farm animals; at the same time, without any
further clue, it could depict anything at all or... everything: it is a world of its own.




Fig. 5. An example of naive painting: it could mean anything! 



5 An example of a visual language 

There are now, in the literature, many proposals for visual languages; a few have 
become commercial and some of them are rigorously defined by a formalism 
(grammatical, logical, algebraic or hybrid), as fully reviewed in [12]. It may be more 
productive to illustrate, by means of two specific languages, some of the features that 
characterise a visual language. 

The basic idea is to represent visually (pictorially) both programs and their 
execution (instead of using a conventional text for the coded program). The pictorial 
syntax must rely on topological relations so as to facilitate the detection of errors in 
the program but, as a more important issue, the visualization of the program execution 
should enable the user to understand the computation process and the achieved 
results, whether partial or final. 

An interesting example of a success story with a visual language is LabVIEW™ 
[13] of National Instruments, which is aimed at signal processing and has been a 
commercial product since its inception in 1986. The data flow paradigm was chosen 
and the visual components of the language schematize the real laboratory devices that 
are generally used in a signal analysis environment: signal generator, signal analyzer, 
signal filters, oscilloscope screen, etc. Two panels are available to the user: a left one 
with the front of a given instrument and a right one with the block diagram showing 
all interconnections. The left panel is intended for showing graphically the input and 
output of the procedure emulating the laboratory instruments; it contains knobs, 
sliders, strip charts and x-y graphs. LabVIEW provides a broad selection of predefined 
function boxes, which include arithmetic and trigonometric functions, string 
manipulation routines, statistical analyses, matrix operations, etc., but programmers 
may also build their own functions. Various control boxes implement FOR loops, 
WHILE loops and CASE statements, while LabVIEW data types include Booleans 
and arrays of Booleans, real numbers, strings and arrays of strings. 

The success of LabVIEW, which has had many releases, improving and 
generalizing on the original project, is due to the good match between the chosen 
visual metaphors on the screen and the represented objects, to the good object 
organization on the screen and to the natural connection between "boxes", which is 
mentally close to what an experimenter would find in a real signal processing and 
analysis laboratory. 



6 Another example of a visual language 

We turn now to another interesting visual language, Prograph [14], where an 
integrated view is taken with respect to the use of visualization for program coding 
(algorithms) and for data representation, since in some cases (comments, labels, 
numbers) text is chosen instead of graphics. 

The authors support the idea that character-based input and textual specification of 
programs depended on the typically available hardware, particularly with respect 
to the sequential nature of the underlying computational model. For this reason, 
a new visual language should not be influenced by the present hardware while, at 
the same time, it should exploit the most recent technological features (like high 
resolution, vast number of colours, refresh speed, larger and cheaper memories, etc.). 
At the same time, processes which are inherently sequential should be represented in 
the same fashion, like keywords, punctuation, etc., while visual representations of 
windows, icons and palettes should be presented and operated consistently, both with 
respect to the operating system and across all the different applications. Moreover, 
complex data should be visualized so as to preserve any present structure, using 
appropriate graphics including different levels of abstraction. 

A visual language should be built by means of simple mouse clicks and very few 
characters, within a syntactic graphical editor. For instance, pulling a graph from one 
of its nodes for improving its visual appearance should preserve the graph topology. 
The interpretation of mouse clicks should be context-dependent, both to enrich the 
repertoire of possible actions and to enable the system to trap errors and ensure only 
correct actions. 

Prograph is based on the object-oriented programming style and is able to represent 
classes, attributes and methods; it uses the dataflow programming paradigm. Prograph 
models the basic computational processes such as parallelism, sequencing, iteration 
and conditional execution. The formal description of this language can be found in 
[15]. 

The Prograph environment is made of three components: 1) an editor, 2) an 
interpreter and 3) an application builder. The first enables the user to write/draw the 
program and to define the data to be attached to the Prograph elements (classes, 
attributes, persistents, methods, cases and instances). The second error-corrects the 
program, helps in debugging and also runs the application; the third helps the user to 
build his own graphical interface for the program. 

Classes in Prograph (system classes, distinguished by a double bar under the 
class icon, and user classes) are displayed in a special classes window (see Figure 6) 
containing visual representations of the current forest of classes, each one depicted by 
an icon, all of them interconnected to display their parent-child relationships. System 
classes have an a-priori hierarchy providing interface features like windows, menus, 
dialogs, buttons and lists for an application. There are also primitive methods (which 
apply to system classes), like copy, cut and paste of strings of text from a text instance 
and a clipboard. A class may have attributes (class attributes) which are invariant for 
all instances of the class, and others (instance attributes) which have distinct values 
for individual instances. Class attributes are illustrated in Figure 7 for 
the classes Book and Index Entry. Each class icon contains a left part (named triangle) 
with attributes and a right part with methods; each class inherits both attributes 
and methods from its ancestor classes and may have additional attributes and 
methods. System classes have a double line at the bottom of the icon. 
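The distinction between class attributes and instance attributes, and the inheritance of both along the class hierarchy, has a direct textual counterpart. The following is only a sketch in Python, not Prograph code; the attribute names (`medium`, `title`, `author`, `reviewer`) and the subclass `Review` are hypothetical and chosen for illustration.

```python
class Book:
    # Class attribute: one value shared by every instance of the class.
    medium = "print"

    def __init__(self, title="", author=""):
        # Instance attributes: each instance holds its own values,
        # defaulting here to the empty string.
        self.title = title
        self.author = author


class Review(Book):
    # A subclass inherits attributes and methods from its ancestor
    # and may add attributes of its own.
    def __init__(self, title="", author="", reviewer=""):
        super().__init__(title, author)
        self.reviewer = reviewer


first = Book("Graph Grammars")
second = Review("Visual Languages", reviewer="S. L.")
```

Both objects see the same value of `medium`, whereas `title` differs from instance to instance; changing `Book.medium` is visible through every instance that has not overridden it.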




A program example in the browsing domain (to be used on ACM reviews in this 
case) is reported by the authors, allowing the user to import a review and edit the text 
file as desired. The default value for initializing the attributes of the Book class is an 
empty string (".. .."); a horizontal line separates the class from its instance attributes. 

Methods can either be class or universal methods, which also include built-in 
Prograph primitives. Class methods are named icons in a methods window for the 
class and user-defined universal methods are represented by icons in a special 
Universal window. A method is a sequence of cases where each case is a dataflow 
structure having data inputs, data output, a set of operations and connections between 
them. 











The cases are graphically illustrated (in Figure 7) within a window: input is by 
roots (at the top) and output is by terminals (at the bottom); the arities of both are the 
numbers of roots and terminals respectively and all cases of a method have the same 
input and output arities. An icon represents the operation within the case with its 
textual name. Operations within the case are connected by data links; data values are 
untyped and a terminal or root of an operation indicates the action of copying a data 
value between the calling operation and the associated method. 




Fig. 7. Class Index and method Index/Sort 

The first visual symbol (icon) corresponding to the input (roots) copies the value 
from a terminal of the calling operation, while the second one is the output (terminal) 
and copies the value into the root of the calling operation. The remaining symbols 
stand, in order, for: a simple method (here the Quicksort user method); a constant, 
256, on a terminal; a test whereby, if the value on the terminal is not NULL, we jump 
to the next case; the output of the value of the persistent Reviews on a root; the 
output of a new Index Entry instance on a root; the output of the value of the 
attribute key of an input instance on the right root; an operation which, for the left 
input instance, sets the value of the attribute review to the right input and outputs 
the instance; and, finally, a local call, i.e. a call to an inner, locally defined method, 
which cannot be called by other operations. 

The execution of the Prograph program begins with a call to the method and passes 
the input data to the input roots so as to allow execution of the first case in the 
method's case sequence. Such execution is data-driven via the dataflow paradigm, i.e. 
it starts as soon as data is available. The output from a case is available only when 
execution halts. The method Index/Sort (Figure 7) sorts a list of index entries and 
displays the sorted values. First we have an input (copying values from the terminal 
of the calling operation); the expected data is an instance of class Index. Next we have 
a get attribute operation, which inputs the Index instance and outputs it together with 
the value of the named attribute entries (a list of Index Entry instances). The value of 
entries is then passed to the sorting method Quicksort, which outputs the elements of 
entries sorted on an attribute key; this output goes to update the value of entries. 
Finally, the class instance is passed to /Build value list, which builds the new display 
list of entries as shown on the next figure. The case, and therefore the method 
Index/Sort, terminates with no output. 
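The chain of operations in Index/Sort can be paraphrased textually. The sketch below is a Python approximation under stated assumptions, not Prograph itself: the function names, the `review` field and the way the display list is built are hypothetical, and each function stands for one box of the dataflow diagram.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    key: str
    review: str = ""


@dataclass
class Index:
    entries: list = field(default_factory=list)


def get_entries(index):
    # "get attribute": outputs the instance together with the attribute value.
    return index, index.entries


def quicksort(entries):
    # Stand-in for the user method Quicksort: sorts on the attribute key.
    if len(entries) <= 1:
        return list(entries)
    pivot, rest = entries[0], entries[1:]
    smaller = [e for e in rest if e.key < pivot.key]
    larger = [e for e in rest if e.key >= pivot.key]
    return quicksort(smaller) + [pivot] + quicksort(larger)


def set_entries(index, entries):
    # Updates the value of the attribute entries and passes the instance on.
    index.entries = entries
    return index


def build_value_list(index):
    # Stand-in for /Build value list: collects the keys to be displayed.
    return [e.key for e in index.entries]


def index_sort(index):
    instance, entries = get_entries(index)
    instance = set_entries(instance, quicksort(entries))
    print(build_value_list(instance))
    # Like the method described above, this returns no output value.


idx = Index([IndexEntry("graph"), IndexEntry("algebra"), IndexEntry("icon")])
index_sort(idx)
```

Each call can only run once the value it consumes has been produced, which is the data-driven ordering that the diagram expresses through its links.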

As we have seen by this brief view of Prograph, although essentially pictorial, the 
language uses text in some restricted way to specify data and to remind the user about 
which program segment is being represented inside the window. Some useful 
mechanisms, not well expressed by text, like multiple inheritance, are better seen by 
means of a graph (more parents for one son implying an acyclic graph rather than a 
tree). Prograph supports parallel processing and is formally described by means of 
first order logic predicates. 

We have thus seen two different approaches to the design of visual languages. 
LabVIEW and Prograph are both inherently pictorial; nevertheless, it is always 
important to fully understand and have some practice with the visual counterparts of 
the visual components, otherwise it may be difficult to grasp the meaning of all the 
graphical symbols, their interconnections and their computational role. 

As far as we have gathered, there are two main problems to be solved when 
designing a visual language: 1) the choice of the language elements with respect to 
their representative value, operators and data (lexicon, syntax) and 2) the definition of 
the users (to make the language semantics tuned to the human interpretation). Finally, 
we also want computers (the corresponding compilers) to understand, in the same 
way as humans, visual sentences in that language. 



7 Human-program interaction 

Different authors have considered the importance of human beings (the users) when 
designing visual languages, rather than confining themselves to the Computer Science 
area, particularly to computer programming languages, of which visual languages 
could be an extension. In [16-17] the authors approached the formalization of a visual 
language from the idea that a language supporting interaction should be based on 
visual sentences to be interpreted both by humans and compilers in exactly the same 
way. According to this approach, the state of the computation may be described as a 
string of attributed symbols, while the image is seen as a composition of structures 
meaningful for a user working in a specific application domain. In fact, a so-called 
computer-communication (com-com for short) model may be considered, as in the top 
part of Figure 8. 

A very similar approach has been followed in [18]; its corresponding computational 
model is depicted at the bottom of the same figure. Two different "worlds" are 
interacting: on the left we have humans, the users, while on the right we have 
programs being executed on a computer; the middle component is the monitor screen 
or visual display on which messages can be generated (by both users and programs) 
and seen (by the same partners). People perceive such graphical and textual 
information and edit it according to their needs, imagination and experience. On the 
other hand, this information must be parsed and interpreted by one or more programs 
and then recreated, manipulated and sent back to the screen: a typical interaction 
between users and programs when a job is being done is made of a series of these 
cycles. For these reasons the issues introduced by the authors for a candidate visual 
language taxonomy were: a) representation of information, b) cycle of interaction and 
c) evaluation from both partners: humans and programs. 




Fig. 8. Computing/communicating model for HCI (top) [17] and Narayanan's view of 
program-human interaction (bottom) [18]. (Diagram labels: Image materialization, 
Image interpretation, Computer.) 



Along the lines described in the first approach, a visual human-computer 
interaction [19] is based on the identification of characteristic structures (cs), for 
example sharp corners or holes in closed contours of digital images. Such structures 
are associated to attributed symbols which describe geometrical and topological 
properties as well as the interpretation of the characteristic structures in the context. 
The triple formed by a characteristic structure (cs) - pictorial part - an attributed 
symbol - description part - and a relation between the two, is called the characteristic 
pattern, or cp for short. 

Informally, a characteristic structure (or structure for short) is a set of pixels 
appearing on the screen, which constitutes a perceptual or functional unit: hence a cs 
is a set of pixels to which a meaning is associated. The support of a cs is the set of 
coordinates of the pixels belonging to it. An attributed symbol (or symbol for short) is 
a tuple which expresses the meaning associated with a cs; it lists the type t of the cs of 
which the symbol is a description, and the properties assumed by some attributes for 
the specific instance being described. We may say that the cs is the materialisation of 
the attributed symbol and the attributed symbol is the interpretation of the cs. Since 
many structures can be simultaneously identified in an image, the image is then a set 
of structures. The set of characteristic patterns formed by such structures, their 
meaning and the relations between structures and meanings, uniquely characterize a 
visual sentence. 

More precisely, a visual sentence (vs for short) is a triple <i, d, <int, mat>>, where 
i is an image, d is a description (previously called a string but now, being 
order-independent, a set of symbols), int is a function (interpretation) mapping the 
structures of i into the symbols describing them and mat is a function (materialisation) 
mapping the symbols in d into the structures in i. 

A visual language (VL) is a set of visual sentences; in general, a vs is built from a 
finite set of generator elements, called the visual alphabet K of the VL. 
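The definitions above translate almost literally into a small data structure. The following Python sketch is one possible encoding, offered under assumptions: the class and field names are hypothetical, and the consistency check (every structure has an interpretation, every symbol has a materialisation) is an assumed reading of what the two functions require, not something stated in [17].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    """Characteristic structure: a named set of pixel coordinates (its support)."""
    name: str
    support: frozenset


@dataclass(frozen=True)
class Symbol:
    """Attributed symbol: the type of the structure plus attribute values."""
    kind: str
    attributes: tuple = ()


@dataclass
class VisualSentence:
    image: set        # i: the structures identified in the image
    description: set  # d: the set of attributed symbols
    interp: dict      # int: structure -> symbol
    mat: dict         # mat: symbol -> structure

    def is_consistent(self):
        # Assumed check: int is defined on every structure of i, mat on every
        # symbol of d, and both map into the other component.
        return (set(self.interp) == self.image
                and set(self.mat) == self.description
                and all(s in self.description for s in self.interp.values())
                and all(c in self.image for c in self.mat.values()))


button = Structure("button", frozenset({(0, 0), (0, 1), (1, 0), (1, 1)}))
ok = Symbol("Button", ("OK",))
vs = VisualSentence({button}, {ok}, {button: ok}, {ok: button})
print(vs.is_consistent())
```

A characteristic pattern then corresponds to one pair of `interp`, together with the structure and the symbol it relates.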

As an example of materialisation within a biomedical application, we will consider 
a liver biopsy during an interactive interpretation by a histologist. The histologist is 
interested in nuclei and cells; after careful observation, the histologist has identified, 
contoured and described four cell nuclei and one hepatocyte cell (enclosing one of the 
nuclei). The current description of the image, seen as a structure in itself, may be 
written as <Biopsy, 10, 4, 1>, where the symbol "Biopsy" denotes the type of 
structure, 10 is the value of an attribute Image Identifier, 4 is the value of an 
attribute Current # of Hepatocyte Nuclei and 1 the value of an attribute 
Current # of Classified Hepatocytes. 

The concatenation of the descriptions of all the identified structures constitutes a 
description d. A function int can be defined which associates, to each recognised 
structure in the image, an attributed symbol in d. On the other hand, a histologist 
schematises his/her results by diagrams such as the one shown in Figure 9. 

Fig. 9. a) Materialisation of the description of a biopsy, b) icons used for the above 
materialisation (legend: Nucleus, Hepatocyte Cell) 

In order to associate such an image to the histological description d, a function 
mat_icon is defined, which assigns its iconic representation to each element in d. In 
this way, a visual sentence is defined, formed by <i, d, <int, mat_icon>>. We must 
note that the materialisation of a description does not necessarily reproduce the 
original image from which the description had been derived. Figure 10 illustrates the 
result of the materialisation of the description by means of the icons drawn on the 
right-hand side. 





Fig. 4. (a) Image part and (b) interpretation function of a visual sentence 



We may now define a visual language as a set of visual sentences. Let us use such 
visual sentences to describe what is normally viewed on a computer display. We 
assume a standard graphical editor which contains a number of icons. The top part of 
the figure shows the graphical editor window while the bottom part has, for each 
pictorial element on the window, its interpreted descriptional counterpart: the 
window, a command, etc. Along this approach we may consider a number of 
important features which are typical of interactive work by means of visual sentences 
as described in [17]. 

An important role is held by time during human-computer interaction; in fact 
interaction time can be studied at different levels of granularity. Interaction occurs in 
the physical time (continuous) but since it is implemented on a computer, we must be 
concerned with the discrete timing introduced by the computer clock [19]. The 
reaction time is the time taken by the computer activity triggered by a user action. 
The actual duration of such an activity could last from microseconds to days. On the 
other hand, permanence, modification, generation or disappearance of characteristic 
structures in the visual representations is the only indication that something has 
occurred during a visual interaction. Hence, we deal only with the notion of time as 
can be visually inferred by these modifications, and not with internal timings of user 
gestures. However, in order for the process to be perceived by the user as a visual 
interaction, it is necessary that a visual sentence transformation, following the user 
action, can be mapped to this. 

Recently, some usability experts have performed measurements on web users to 
find out how much a web page should change in order for the change to be noticed; 
the answer was that users seem to notice changes somewhere between 12 and 30 
pixels (taken from the Visual-L Internet list on visual interfaces and tools [18]). We 
assume that gestures are atomic and hence occur instantaneously. We equally abstract 
from the actual physical time of the external world, or the computer clock, or the time 
perceived by the user, and define a notion of time based on transformation of visual 
sentences. 

Let us now consider space, i.e. the surface of the screen on which our visual sentences 
will be displayed, sometimes called the screen real estate. On this space no linear 
ordering is naturally imposed, while a wealth of spatial relations can be defined on 
any two characteristic structures: they can be overlapping, disjoint, one on the left of 
the other, on the right, above, below, etc. All these relations are potentially 
significant, so their use must be constrained and clearly indicated to the user. In fact, 
one of the main problems with two dimensional parsers is directly related to the way 
an order is imposed in scanning the pictorial elements that represent the visual 
program. A first, strong, requirement is that spatial relations be used consistently 
across different sets of characteristic patterns [19]. Indeed it would be disorienting if 
in some case a relation such as "greater" or "equal" were represented by the first 
element being above the second and, for a different set of characteristic patterns in the 
same visual sentence, by the first being below the second. 
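To make the spatial relations concrete, the sketch below computes some of them from the supports of two characteristic structures. It is an illustration under assumptions: supports are sets of (x, y) pixel coordinates with y growing downwards, and the directional relations are decided on bounding boxes, which is only one of several reasonable definitions.

```python
def bbox(support):
    """Bounding box (min_x, min_y, max_x, max_y) of a set of pixel coordinates."""
    xs = [x for x, _ in support]
    ys = [y for _, y in support]
    return min(xs), min(ys), max(xs), max(ys)


def relations(a, b):
    """Spatial relations holding between supports a and b (y grows downwards)."""
    ax0, ay0, ax1, ay1 = bbox(a)
    bx0, by0, bx1, by1 = bbox(b)
    found = {"overlapping" if a & b else "disjoint"}
    if ax1 < bx0:
        found.add("left-of")
    if bx1 < ax0:
        found.add("right-of")
    if ay1 < by0:
        found.add("above")
    if by1 < ay0:
        found.add("below")
    return found


first = {(0, 0), (1, 0)}
second = {(3, 0), (4, 0)}
print(relations(first, second))
```

A consistency requirement like the one above could then be checked by verifying that the same relation name is always used to encode the same meaning across the characteristic patterns of a visual sentence.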

Another important issue is related to the incremental transformations of the active 
visual sentences under the submission of commands by the user and computations of 
the system. The presence and state of characteristic structures in the current image 
must define what the results of the previous interaction are and what kind of 
interactions are possible (expressed as the principle of honesty in [20]). The frame of 
the interaction process is the set of image structures which remain unchanged, thus 
providing a context to be used as graphical reference. The notion of frame can provide 
advantages both in the design of interfaces and in their usability evaluation by clearly 
identifying which structures are designated to define the context of the interaction and 
through which variations they can express information on the interaction state. 

Moreover, the definition of frame places two constraints on its elements: a 
geometrical one, structures maintain the same support; and a semantic one, structures 
are characteristic structures, i.e. are associated to a specific meaning. 
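The two constraints on the frame can be checked mechanically. In the sketch below, which rests on assumptions, an image is a dictionary from hypothetical structure names to a pair (support, meaning); a structure belongs to the frame if it keeps the same support across the interaction step and has a meaning associated with it in both images.

```python
def frame(before, after):
    """Names of the structures forming the frame of one interaction step.

    Each image maps a structure name to (support, meaning), where support is
    a frozenset of pixel coordinates and meaning is an attributed symbol,
    or None for a set of pixels without an associated meaning.
    """
    return {
        name
        for name, (support, meaning) in before.items()
        if name in after
        and after[name][0] == support      # geometrical constraint
        and meaning is not None            # semantic constraint
        and after[name][1] is not None
    }


before = {
    "menu": (frozenset({(0, 0), (1, 0)}), ("Menu", "File")),
    "cursor": (frozenset({(5, 5)}), ("Cursor",)),
    "speck": (frozenset({(9, 9)}), None),
}
after = {
    "menu": (frozenset({(0, 0), (1, 0)}), ("Menu", "File")),
    "cursor": (frozenset({(6, 5)}), ("Cursor",)),
    "speck": (frozenset({(9, 9)}), None),
}
print(frame(before, after))
```

Here the menu provides the graphical reference, the cursor is excluded because its support changed, and the speck is excluded because it carries no meaning.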



8 Summing up and open problems 

We have seen that there are different ways to define visual languages, their syntax and 
semantics as well as the role of pragmatics for the interpretation of the visual 
expressions, be they graphs, visual sentences or other graphical representations. As 
has been pointed out by many authors, visual programming languages may have some 
specific application areas where the representation of their basic elements is not far 
from the real objects belonging to the working domain and at the same time, the 
computations to be performed on them are also naturally and logically visualized. It is 
in these cases where a visual program may work at its best. On the other hand, a 
general purpose visual language cannot be an effective and efficient programming 
solution, perhaps for the same reasons that a universal programming language has not 
yet emerged after forty years of trying. Moreover, text (as well as numbers) cannot 
easily be replaced entirely by pictures, icons or graphs, so that an integrated 
(sometimes called hybrid) solution may work better, coupling the advantages of 
graphical representations to those of alphanumerical symbols. We must take good 
care, and guidelines are rapidly emerging from research centers, not to use textual 
constructs where figures could be simpler to grasp and not to use figures where their 
meaning is hard to interpret. 

Today's problems in visual language research may be 
summed up as follows. First, and foremost, the recognition that a visual language may 
be adequate for a given application domain, so that one should not aim at the design 
of general purpose languages. Second, the formal description of its grammar is not 
enough to guarantee a single meaning to each language construct; this implies that 
the semantics of the elements and of their composition must be described in such a 
way that a unique meaning is associated to them for both humans and programs. 
Third, the contribution of psychologists and cognitivists is necessary for the design of 
visual elements which may correspond to objects/actions in the real world, where 
tasks must be solved by means of interactive computing. Fourth, and last, a 
community of users must be defined, perhaps by means of a user model, so as to 
ensure that the new visual language is addressed to such a community. 



9 International events 

Research on visual languages officially started, with paper presentations, at 
the first IEEE International Workshop (since 1993 called Symposium) on Visual 
Languages; the series was held in: Hiroshima 1984, Dallas 1986, Linköping 1987, 
Pittsburgh 1988, Rome 1989, Chicago 1990, Kobe 1991, Seattle 1992, Bergen 1993, 
St. Louis 1994, Darmstadt 1995, Boulder 1996, Capri 1997, Halifax 1998, Tokyo 1999, 
Seattle 2000. 

Each one had its own Proceedings [21]; this year the Symposium will be held in 
September in Tokyo (Japan). Some very useful references are two books on Visual 
Languages [22] and Visual Programming Languages [23]. 

In 1990, the Journal of Visual Languages and Computing, published by Academic 
Press, London, was born; it is co-edited by Shi-Kuo Chang (University of Pittsburgh) 
and Stefano Levialdi (University of Rome). 

Another series of meetings, held every two years (sponsored by ACM-SIGCHI) and 
called Advanced Visual Interfaces, bears relevance to the subject of visual languages. 
These International Working Conferences started in Rome (1992), followed by Bari 
(1994), Gubbio (1996) and L'Aquila (1998); the next is to be held in Palermo (Italy) 
in May 2000. At the same time, meetings on Visual Reasoning, Algorithm Animation, 
Scientific Visualization, User Interfaces and Computer-Human Interaction also 
contain topics which are connected to research on Visual Languages. 

Abstract. Graph transformation is a well studied computational model 
for specification and programming. In this paper we outline a path that 
can be taken in order to turn graph transformation into a rule-based 
language for programming with diagrams. In particular, we discuss how data 
abstraction and functional abstraction can be achieved in the setting of 
graphs, by minimal extensions of the underlying graph and transformation 
model. 



1 Introduction 

The rule-based transformation of graphs is a well studied field of theoretical 
computer science, see [25]. Graph transformation (also known as graph grammar 
theory, and graph reduction) has been applied successfully for modelling software 
systems and studying their behaviour, see [12]. 

So far, Progres [28] is the most successful programming language and system 
based on graph transformation. It supports functional abstraction, control 
structures (including backtracking), and encapsulation. However, Progres and 
other graph transformation languages still have some deficiencies: 

— Graphs, their central data structures, are flat: they may not contain other 
graphs as components. The concept of aggregation is missing. 

— Most of their programming concepts have only been added as textual 
constructs, on top of the graph transformation mechanism. Thus relevant parts 
of the languages are no longer graphical and rule-based. 

We believe that a graph transformation language needs both features if it 
is to compete with visual object-oriented programming languages. Therefore we 
extend graphs by an aggregation concept, and lift a simple graph transformation 
mechanism to this model. Then we introduce functional abstraction, control, 
typing, and graph-oriented encapsulation, by slight modifications to the graph 
model and transformation mechanism. The resulting language proposal is still 
completely graphical and rule-based. 

This work has been partially supported by the ESPRIT Working Group 
Applications of Graph Transformation (Appligraph). To appear in: M. Nagl, A. Schürr 
(eds.): Proc. Workshop on Applications of Graph Transformation (Agtive'99), 
Lecture Notes in Computer Science, April 2000. 

M. Nagl, A. Schürr, and M. Münch (Eds.): AGTIVE'99, LNCS 1779, pp. 165-180, 2000. 
(c) Springer-Verlag Berlin Heidelberg 2000 

The paper is organized as follows: Graphs and a simple way of graph 
transformation are introduced in section 2, and extended by a concept for graph 
aggregation in section 3. Concepts for functional abstraction and control are 
proposed in section 4, and some ideas concerning typing are outlined in 
section 5. Then, in section 6, we show how subgraphs and transformations can be 
encapsulated in classes. We conclude with some remarks on related and future 
work. 

Due to space limitations, the presentation is informal. The concepts are 
explained by a running example concerned with the graphical representation of 
queues and queue operations. 

2 Graph Transformation 

We introduce a notion of graphs, and graph transformation that is simple, and 
yet expressive enough to form the basis for specification and programming. 

2.1 Graphs 

Graphs represent relations between entities as edges between nodes. Usually, 
edges link two nodes. We, however, allow edges that link any number of nodes, 
and label them so that different relations, of any arity, can be represented in 
a single graph. We also allow that a sequence of nodes, called points, may be 
designated as the interface of a graph at which it may be glued with other graphs. 
In the literature, such graphs are known as pointed hypergraphs [16,7]. 

Example 1 (Queue Graphs). The structure of queue graphs is represented by 
two kinds of edges: a Q-labelled edge is linked to the begin and end node of a 
chain of I-labelled edges; every I-labelled edge is in turn linked to the begin and 
end node of an item graph that is stored in the queue. 
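A pointed hypergraph of this kind can be written down as a small data structure. The sketch below is hypothetical in its details: the text does not fix the arity of the edges or the points of a queue graph, so the four attachments of each I edge (its two chain nodes followed by the begin and end node of its item graph) and the choice of the chain ends as points are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Hypergraph:
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)   # entries: (label, attached nodes)
    points: tuple = ()                           # interface nodes, in order

    def add_edge(self, label, *attached):
        # An edge may be attached to any number of nodes.
        self.nodes.update(attached)
        self.edges.append((label, tuple(attached)))


def queue_graph(n_items):
    """Queue holding n_items items: a Q edge spans a chain of I edges."""
    g = Hypergraph()
    chain = [f"c{i}" for i in range(n_items + 1)]
    g.add_edge("Q", chain[0], chain[-1])
    for i in range(n_items):
        # Assumed arity: chain predecessor, chain successor, item begin, item end.
        g.add_edge("I", chain[i], chain[i + 1], f"b{i}", f"e{i}")
    g.points = (chain[0], chain[-1])   # assumption: the chain ends are the points
    return g


q = queue_graph(2)
print(q.edges)
```

The item graphs themselves would be attached between the nodes `b0`/`e0` and `b1`/`e1`; they are left empty here.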

Figure 1 shows how graphs are depicted in this paper. Nodes are drawn as 
circles, and filled if they are points. Edges are drawn as rectangles around their 
label, and are connected to their attachments by lines that are ordered 
counter-clockwise, starting at noon. The rectangles for binary edges with empty labels 
"disappear" so that they are drawn as lines from their first to their second linked 
node. 

We abstract from some graph features although they are important in 
practice: 

— Typing is only considered as far as it makes our constructs and definitions 
well-defined. Section 5 discusses some further issues. 

— Attributes are values that may be associated to nodes and edges in order to 
represent non-structural properties of a graph. We do not consider attributes 
here although they will be used in implementations, e. g. for computing the 
layout of graphs. 

— Notation and layout of graphs is often tailored towards a particular 
application domain. Such conventions for the drawing of nodes and edges define 
diagram languages. We restrict ourselves to the graph notation explained 
above, and refer to DiaGen [22] for a system that allows one to specify diagram 
languages for the kind of graphs considered here. 



2.2 Rules and Transformation 

We use a simple kind of gluing graph transformation [9] that is compatible with
a restricted form of substitution-based graph transformation [17].

A graph transformation rule $t : P \rightarrow R$ (rule for short) consists of a pattern
graph P and a replacement graph R. A transformation step from a host graph G
to some graph H via a graph transformation rule t is written $G \Rightarrow_t H$ and
proceeds as follows:

— Match the pattern graph P, i.e. find a subgraph P' in G that is a copy of P.

— Check that every node in P' which is linked to an edge outside P' corresponds
to a point of P. (Otherwise, the clipping described below would leave some
edges with dangling links.)

— Clip P' by removing it up to its points, to obtain the context graph C.

— Glue a copy R' of the replacement graph R to C by identifying the points
of P' with the corresponding points of R', to obtain the transformed graph H.
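The four steps can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the match is passed in explicitly as two dictionaries (no subgraph search is performed), the points of the replacement are assumed to be distinct, and the names Graph and apply_rule are ours, not part of any existing tool.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Graph:
    nodes: set
    edges: dict          # edge id -> (label, tuple of attached nodes)
    points: tuple = ()   # ordered interface nodes


_fresh = count()         # supplies identifiers for copied nodes and edges


def apply_rule(host, pattern, replacement, node_map, edge_map):
    """Perform one step; node_map and edge_map describe an already known match.

    Returns the transformed graph, or None if the dangling check fails."""
    matched_nodes = {node_map[n] for n in pattern.nodes}
    matched_edges = {edge_map[e] for e in pattern.edges}
    point_images = [node_map[p] for p in pattern.points]

    # Check: a matched node linked to an edge outside the match must be a point.
    for edge_id, (_, attached) in host.edges.items():
        if edge_id in matched_edges:
            continue
        if any(n in matched_nodes and n not in point_images for n in attached):
            return None

    # Clip: remove the match up to its points; what remains is the context graph.
    nodes = (host.nodes - matched_nodes) | set(point_images)
    edges = {e: v for e, v in host.edges.items() if e not in matched_edges}

    # Glue: copy the replacement, identifying its points with those of the match.
    glue = dict(zip(replacement.points, point_images))
    for n in replacement.nodes:
        glue.setdefault(n, ("new", next(_fresh)))
        nodes.add(glue[n])
    for label, attached in replacement.edges.values():
        edges[("new", next(_fresh))] = (label, tuple(glue[n] for n in attached))
    return Graph(nodes, edges, host.points)


# Host: a chain a -I-> b -I-> c. The rule relabels one I-edge to J.
host = Graph({"a", "b", "c"}, {"e1": ("I", ("a", "b")), "e2": ("I", ("b", "c"))})
pattern = Graph({"x", "y"}, {"p": ("I", ("x", "y"))}, points=("x", "y"))
replacement = Graph({"u", "v"}, {"r": ("J", ("u", "v"))}, points=("u", "v"))
result = apply_rule(host, pattern, replacement, {"x": "a", "y": "b"}, {"p": "e1"})

# If y is not a point, clipping its image b would leave e2 dangling.
strict = Graph({"x", "y"}, {"p": ("I", ("x", "y"))}, points=("x",))
rejected = apply_rule(host, strict, Graph({"u"}, {}, ("u",)),
                      {"x": "a", "y": "b"}, {"p": "e1"})
```

In the first call both pattern nodes are points, so the second I-edge stays attached to b; in the second call the check rejects the match before anything is removed.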

A graph may host several matches, of several rules. Thus graph transformation 
is nondeterministic in general. This gives a potential for concurrency: Several 
matches of rules can be replaced in parallel if they are independent in a certain 
sense, see e. g. [9]. 

Graph transformation with a set T of graph transformation rules considers
sequences of sequential transformation steps in arbitrary order, and of arbitrary
length. (We ignore concurrency, as an independent parallel step corresponds to
a sequence of sequential steps.) If there is a transformation sequence
$G_0 \Rightarrow_{t_1} G_1 \Rightarrow_{t_2} \cdots \Rightarrow_{t_n} G_n$ with rules $t_1, \ldots, t_n$ from T,
we write $G_0 \Rightarrow_T^* G_n$ and say that T transforms $G_0$ to $G_n$.

Graph transformation can be used to define graph languages, analogous to
Chomsky grammars, as the set of all terminal graphs G into which T transforms
some distinguished start graph S, where terminality is usually defined by the
absence of certain (“nonterminal”) labels in a graph.

Graph transformation can be used to specify a function on graphs, like term
rewriting [19] specifies functions on terms, by taking an arbitrary graph as
input, and transforming it as long as possible. This function is partial if certain
graphs can be transformed infinitely, and nondeterministic if a graph may be
transformed in different ways.
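Transforming "as long as possible" can be sketched as a loop that stops when no rule applies. The following Python fragment is an illustration only: graphs are simplified to sets of (label, source, target) triples, the two rules are hypothetical, and a step limit stands in for the partiality of the function. Trying the rules in a fixed order resolves the nondeterminism in one particular way.

```python
def transform_as_long_as_possible(graph, rules, max_steps=1000):
    """Apply rules until none is applicable; the result is a normal form."""
    for _ in range(max_steps):
        for rule in rules:
            result = rule(graph)
            if result is not None:
                graph = result
                break
        else:
            return graph      # no rule applied to the current graph
    raise RuntimeError("no normal form reached within max_steps")


def remove_loop(graph):
    """Hypothetical rule: delete one edge whose source equals its target."""
    for edge in sorted(graph):
        _, source, target = edge
        if source == target:
            return graph - {edge}
    return None               # rule is not applicable


def relabel_a(graph):
    """Hypothetical rule: replace one a-labelled edge by a b-labelled edge."""
    for edge in sorted(graph):
        label, source, target = edge
        if label == "a":
            return (graph - {edge}) | {("b", source, target)}
    return None


start = frozenset({("a", 1, 2), ("a", 2, 2), ("c", 3, 3)})
normal_form = transform_as_long_as_possible(start, [remove_loop, relabel_a])
```

A rule that always applies, such as one returning its input unchanged, makes the function undefined on every graph; the sketch reports this by raising an exception once the step limit is exhausted.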

It is this last way of using graph transformation that is the basis for pro- 
gramming with graph transformation, but language generation is useful too, for 
typing (see section 5). 

Example 2 (Queue Graph Transformation). Figure 2 shows a rule that dequeues 
the first item graph of a queue graph. Figure 3 shows how this rule transforms 
the queue graph in Figure 1. The occurrences of the pattern and replacement 
graphs in the host graph, and transformed graph are drawn with fat lines. 




Fig. 2. A dequeuing rule 





Fig. 3. A dequeuing step 



This representation of queue graphs is not really adequate. It provides no
general answer to the question: What is the item graph linked to some I-edge?
In Figure 3, we could assume that an item frame designates all nodes and edges
connected to both its begin and end node. Then, queues could only be used to
store connected graphs, which might be too restrictive. (We could link I-edges
to all nodes that belong to the item graph; then it would still be open which of
the edges between those nodes in the host graph belong to the item graph.)

3 Structured Graphs 

The graphs considered so far are composite values, but flat: Their components,
nodes and edges, are primitive; none of them may be a graph again. That is the
general problem when graphs shall be composed from subgraphs, like the queue
and item graphs in Example 2 above. To overcome this limitation, we introduce
compound edges that may contain graphs, and extend graph transformation
correspondingly.



3.1 Hierarchical Graphs 



A hierarchical graph consists of a graph as considered before, called its top-level
graph, wherein some edges, called frame edges (or just frames), contain graphs
that may be hierarchical again.

The hierarchical graphs that occur in rules may furthermore contain variable
edges as placeholders for graphs. Variable edges bear distinguished labels, called
variable names. A mapping $\beta = \{X_1 \mapsto G_1, \ldots, X_n \mapsto G_n\}$ that associates
hierarchical graphs $G_i$ with variable names $X_i$ ($1 \le i \le n$) is called a binding.
The instantiation of a hierarchical graph G according to a binding $\beta$ is denoted
by $G\beta$, and obtained by removing every $X_i$-edge x in G, and gluing a copy of
$\beta(X_i)$ to the nodes that were linked to x.
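Instantiation according to a binding can be sketched in Python as shown below. The data structure is an assumption made for illustration: frames and variables are edges distinguished by a kind field, the contents of a frame form a separate graph with its own nodes, and the points of a bound graph are assumed to be distinct. The names HGraph, Edge and instantiate are ours.

```python
import copy
from dataclasses import dataclass
from itertools import count


@dataclass
class Edge:
    label: str
    nodes: tuple               # linked nodes, in order
    kind: str = "plain"        # "plain", "frame", or "variable"
    contents: object = None    # for frames: the contained HGraph


@dataclass
class HGraph:
    nodes: set
    edges: list
    points: tuple = ()


_fresh = count()


def instantiate(graph, binding):
    """Return a copy of graph where each bound variable edge is replaced."""
    nodes, edges = set(graph.nodes), []
    for edge in graph.edges:
        if edge.kind == "variable" and edge.label in binding:
            bound = binding[edge.label]
            # The i-th point of the bound graph is glued to the i-th linked node.
            rename = dict(zip(bound.points, edge.nodes))
            for n in bound.nodes:
                rename.setdefault(n, ("copy", next(_fresh)))
            nodes.update(rename.values())
            for b in bound.edges:
                edges.append(Edge(b.label, tuple(rename[n] for n in b.nodes),
                                  b.kind, copy.deepcopy(b.contents)))
        elif edge.kind == "frame":
            # Variables may also occur inside the contents of frames.
            edges.append(Edge(edge.label, edge.nodes, "frame",
                              instantiate(edge.contents, binding)))
        else:
            edges.append(copy.copy(edge))
    return HGraph(nodes, edges, graph.points)


# A queue frame whose contents consist of the variable X between two nodes.
queue = HGraph({"b", "e"},
               [Edge("Q", ("b", "e"), "frame",
                     HGraph({"b", "e"}, [Edge("X", ("b", "e"), "variable")],
                            ("b", "e")))])
# X is bound to a chain of two item frames; their contents are left empty here.
chain = HGraph({"p", "m", "q"},
               [Edge("I", ("p", "m"), "frame", HGraph(set(), [])),
                Edge("I", ("m", "q"), "frame", HGraph(set(), []))],
               ("p", "q"))
result = instantiate(queue, {"X": chain})
inner = result.edges[0].contents
```

After instantiation the queue frame contains two item frames that share a freshly copied middle node, while the original graph with its variable edge is left unchanged.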



Example 3 (Hierarchical Queue Graphs). For a hierarchical graph representation
of queues, we turn the Q- and I-edges of Examples 1 and 2 into frames that
contain queue and item graphs, respectively. Frames are rectangles (like ordinary
edges), with their contents drawn inside; they are filled in different shades of grey,
and their labels are omitted. Variable names appear in italics.

Figure 4 shows two queue graphs. The graph on the left hand side contains
a variable X, and the graph on the right hand side is its instantiation with the
binding for X that is depicted in the figure.












Fig. 4. Two hierarchical queue graphs 



In this representation, item graphs are always complete (clippable) subgraphs 
that are disjoint to each other. This helps to maintain the consistency of the rep- 
resentation. Note that item frames may contain graphs of any arity; in Figure 4, 
they have 1, 2, or 0 points. 

To keep the presentation simple, we do not consider compound nodes that may 
contain hierarchical graphs. Such nodes can be simulated by unary frame edges 
pointing to plain nodes. In a real programming language, however, compound 
nodes should be supported, if only for symmetry. 






Berthold Hoffmann 



3.2 Hierarchical Graph Transformation 

In a hierarchical graph transformation rule (hierarchical rule, for short) $t : P \rightarrow R$,
the hierarchical pattern and replacement graphs P, R may contain variables, but
every variable name occurring in R must occur in P as well, and every variable
name may occur at most once in P.

The transformation of hierarchical graphs is then performed as follows: 

— Match the top-level of the pattern graph P, either on the top-level of the
host graph G, or recursively in the contents of some of its frames; then
match the contents of every frame in P recursively with the contents of the
corresponding frame in G.

— Bind the variables in P during matching, i.e. construct a binding $\beta$ such
that $P\beta$ is a subgraph of G.

— Check whether $P\beta$ is clippable.

— Clip $P\beta$ to obtain the context graph C.

— Glue an instantiated copy $R\beta$ of the replacement graph R to C.

It should be noted that the Bind step requires graph parsing which is not 
defined in general. However, for the typing considered in section 5, parsing al- 
gorithms exist, even if they are not efficient. 

Example 4 (Hierarchical Queue Graph Transformation). Figure 5 shows two
hierarchical graph transformation rules for enqueuing and dequeuing. Figure 6
shows an enqueuing transformation, followed by a dequeuing transformation.





Fig. 5. Hierarchical rules for enqueuing (top) and dequeuing (bottom) 



The variables Q and X bind queue graphs and item frames, respectively.
Enqueuing duplicates an item frame with its entire contents, by duplicating the
variable X in its replacement graph; dequeuing deletes an item frame, again with
its contents, by deleting X in its replacement graph.

Note that the time required for enqueuing and dequeuing does not depend
on the length of the queue, whereas at least one of these operations would need
at least logarithmic effort in a term rewriting implementation [6].
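The reason is that both operations only touch one end of the chain. A minimal Python sketch of this idea, assuming a queue graph is stored as a chain of item edges between a begin and an end node (the class and function names are ours):

```python
class Node:
    """A node of the chain; `out` is the item edge leaving it, if any."""

    def __init__(self):
        self.out = None              # (item, next node), or None at the end node


class QueueGraph:
    def __init__(self):
        # Empty queue: a single node is both the begin and the end point.
        self.begin = self.end = Node()


def enqueue(queue, item):
    """Attach a new item edge behind the end node; no traversal is needed."""
    new_end = Node()
    queue.end.out = (item, new_end)
    queue.end = new_end


def dequeue(queue):
    """Remove the item edge at the begin node; returns None on an empty queue."""
    if queue.begin.out is None:
        return None
    item, successor = queue.begin.out
    queue.begin = successor
    return item


demo = QueueGraph()
enqueue(demo, "item 1")
enqueue(demo, "item 2")
first = dequeue(demo)
```

Each operation performs a fixed number of pointer updates, independent of how many items lie between the two points.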

Fig. 6. An enqueuing, followed by a dequeuing transformation



4 Abstraction and Control 

Graph transformation provides no explicit way to compose transformations
from simple ones, and no means to control the order in which rules shall be 
applied. We introduce transformation predicates as a means to structure and 
parameterize transformations. We extend rules by application conditions, and 
show that this can be used to specify control flow. 



4.1 Transformation Predicates 

A graph transformation rule t can only “call” other rules by indicating, in its
replacement, places where other rules shall be applied later. This can be done by
inserting a p-labelled edge e in the replacement of t that appears in the pattern
of a rule that shall be applied there. The label p can then be considered as the
name of that rule, and e’s links indicate the parameters to which it shall be applied.

We distinguish certain labels as predicate names. A graph transformation 
predicate (or just predicate) consists of a predicate name p that is associated 
with a set of graph transformation rules, called its body. Every pattern in the 
body of p contains exactly one p-labelled edge. An edge labelled by a predicate
name is called a button edge (or just button), and is depicted as an oval.

Predicates are called by inserting buttons into the start graph of a transfor- 
mation, or into the replacement graphs of rules and predicates. 

A predicate is applied by applying a rule of its body to one of its calls in the 
host graph. A predicate is evaluated by applying it, and evaluating all predicates 
that are called in its replacement, recursively. 

The links of a button point to the parameters of the transformation predicate. 
A parameter can be just a node, or an edge with its linked nodes. In particular, 
such an edge can be a frame that contains a graph parameter (as in Example 5 
below), or a button that denotes a predicate parameter (as in Example 6 below). 
Thus buttons are meta edges; they may not only have links to nodes, but also
meta links to other edges or to meta edges.

4.2 Success and Failure

The application of a predicate is very similar to the application of simple graph 
transformation rules. However, for a transformation predicate, the following 
question arises: What happens if a predicate is called, but none of its rules
applies? This situation can be handled in one of the following ways:

— The transformation predicate, or its call, is considered erroneous, and trans- 
formation aborts. 

— The call is considered to fail, and cancelled so that another rule may be 
applied instead. 

— The call is considered to succeed, and transformation may continue. 

The first interpretation is that of functional languages, and the latter ones are 
used in logical languages. We allow both. In any case, buttons are always re- 
moved during the transformation because they are meta edges that are just in- 
troduced to control the program’s execution, but shall not be part of the graphs 
it computes. 

Success, failure and abortion of a transformation predicate are specified in
its body: We require an otherwise definition, which starts with a dedicated
symbol and is followed by one of three symbols that denote success, failure,
and abortion, respectively.

So we refine the application of predicates as follows: Whenever it turns out 
that no rule applies to a button, the predicate’s otherwise definition is interpreted 
as described above. 
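The three outcomes can be sketched in Python as follows. This is an illustration under strong simplifications: a "graph" is any value, a rule is a function that returns the transformed value or None if it does not apply, and buttons are not represented at all. The names Otherwise, Predicate and apply_predicate are ours.

```python
from dataclasses import dataclass
from enum import Enum


class Otherwise(Enum):
    SUCCEED = "succeed"
    FAIL = "fail"
    ABORT = "abort"


class TransformationAborted(Exception):
    """Raised when no rule applies and the otherwise definition is ABORT."""


@dataclass
class Predicate:
    name: str
    rules: list            # each rule maps a graph to a new graph, or to None
    otherwise: Otherwise


def apply_predicate(predicate, graph):
    """Return (succeeded, graph) after applying one rule of the body."""
    for rule in predicate.rules:
        result = rule(graph)
        if result is not None:
            return True, result
    # No rule applies: interpret the otherwise definition.
    if predicate.otherwise is Otherwise.ABORT:
        raise TransformationAborted(predicate.name)
    return predicate.otherwise is Otherwise.SUCCEED, graph


def drop_first(queue):
    """Stand-in for the dequeuing rule; a tuple plays the role of a queue graph."""
    return queue[1:] if queue else None


dequeue = Predicate("dequeue", [drop_first], Otherwise.FAIL)
outcome_nonempty = apply_predicate(dequeue, ("a", "b"))
outcome_empty = apply_predicate(dequeue, ())
```

With the failing otherwise definition, a call on the empty queue reports failure and leaves the value unchanged, so that a caller may try another rule instead.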

Example 5 (A Graph Transformation Predicate). In Figure 7 the rule of Exam- 
ple 4 is re-specified as a transformation predicate dequeue that is parameterized 
by queue frames. 





Fig. 7. The transformation predicate dequeue



The body of dequeue contains a single rule; its otherwise definition
leads to failure if it is applied to an empty queue.

4.3 Application Conditions

A transformation predicate can be applied as soon as one of its pattern graphs
matches a clippable part of the host graph. All predicates inserted by the
application can then be called in arbitrary order. However, the application will often
succeed only if some of these predicates evaluate successfully. It makes sense to
evaluate them first, before attempting to evaluate the rest.

We extract these “critical” predicate calls as an application condition, and
denote a conditional rule as $t : P \mid A \rightarrow R$. It is applied as follows: If the pattern
graph P matches, its application condition A is glued to the host graph, and
evaluated completely. Only if this succeeds, the pattern occurrence P' is replaced
by a copy of the replacement graph R; otherwise, the rule is not applicable,
and the remainders of A are removed.
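The order of events in a conditional rule can be sketched as below. The sketch assumes immutable values in place of graphs, so that "removing the remainders" of a failed condition amounts to discarding an intermediate result; the function names and the counting example are hypothetical.

```python
def apply_conditional_rule(graph, match, condition, replace):
    """Apply a rule with pattern, application condition and replacement.

    match(graph)      -> an occurrence of the pattern, or None
    condition(graph)  -> (succeeded, graph after evaluating the condition)
    replace(graph, o) -> graph with occurrence o replaced
    Returns the new graph, or None if the rule is not applicable."""
    occurrence = match(graph)
    if occurrence is None:
        return None                 # the pattern does not match
    succeeded, after_condition = condition(graph)
    if not succeeded:
        return None                 # condition failed: the input stays untouched
    return replace(after_condition, occurrence)


# Hypothetical example: the state is (queue, counter). The rule increments
# the counter, but only if an item can be dequeued first.
def match_counter(state):
    return "counter"                # the counter is always present


def dequeue_condition(state):
    queue, counter = state
    if not queue:
        return False, state
    return True, (queue[1:], counter)


def increment(state, occurrence):
    queue, counter = state
    return queue, counter + 1


applied = apply_conditional_rule((("a", "b"), 0), match_counter,
                                 dequeue_condition, increment)
blocked = apply_conditional_rule(((), 3), match_counter,
                                 dequeue_condition, increment)
```

The effect of a successful condition is kept (one item is gone), whereas a failed condition leaves no trace and the rule is simply not applicable.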

Figure 8 below shows a predicate with a conditional rule. 



4.4 Control 

Transformation predicates already provide two simple control mechanisms: 

— Pattern matching and otherwise definitions allow for case distinction. 

— Application conditions specify which predicate calls in a rule are evaluated
first.

With recursion, and by using combinator predicates that have predicate
parameters, this suffices to specify control within the language.

Example 6 (A Control Combinator). Figure 8 shows a control combinator
normalize that applies to a transformation predicate denoted by the variable T,
evaluates T as an application condition, and, if that succeeds, calls itself
recursively. As T shall bind to predicate calls with any number of parameters, we use
the dot notation to indicate that T links to a varying number of nodes. Where
the T-button is used as a predicate parameter, it is disguised as an ordinary edge
by drawing a frame around its button. This prevents it from evaluation while it
is “carried around” (in the pattern and replacement graph of the rule).



Fig. 8. The control combinator normalize



In Figure 9, normalize is applied to a disguised call of dequeue. Every
application of normalize removes one item frame by evaluating dequeue as an application
condition, until the queue frame contains no item frame, and dequeue fails. The
empty queue graph is represented by a single node; the numbers 1 and 2 attached
to it shall indicate that this node is the first, as well as the second point of the
queue graph.

Fig. 9. An evaluation of normalize



The use of conditional rules is crucial for the termination of normalize: Had 
the call of T been inserted in the replacement graph, normalize could loop in its 
recursion without ever applying T . If defined as above, normalize will only loop 
if T does not terminate. 
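The behaviour of normalize can be mimicked by a higher-order function in Python. This is a sketch under assumptions: a predicate is a function returning a pair (succeeded, graph), a tuple stands in for a queue graph, and we assume that normalize itself succeeds once T fails.

```python
def normalize(predicate, graph):
    """Evaluate predicate as an application condition, then recurse."""
    succeeded, result = predicate(graph)    # T is evaluated before anything else
    if not succeeded:
        return graph                        # T failed: stop with the current graph
    return normalize(predicate, result)     # the replacement calls normalize again


def dequeue(queue):
    """Stand-in for the predicate dequeue; fails on the empty queue."""
    if not queue:
        return False, queue
    return True, queue[1:]


emptied = normalize(dequeue, ("a", "b", "c"))
```

Passing dequeue as an argument without calling it corresponds to the disguised button: the predicate is only evaluated inside normalize. Since every recursive call is preceded by a successful evaluation of T, the recursion stops as soon as T fails.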

Other imperative control structures like while loops, or functional
combinators like map and reduce as in Haskell [23], can be defined in a similar way. The
most common of them should be predefined in the language.

5 Typing 

Typing specifies how the data of a program is structured, and establishes rules 
for applying operations to data. These rules can be checked, preferably just 
by inspecting the program, before executing it, in order to ensure that it is 
consistent. 

5.1 Graph Structures 

In modern programming languages like Haskell [23], a type definition 

Intlist ::= Nil | Cons Int Intlist

specifies the structure of data recursively. We introduce similar definitions for 
the structure of graphs. 

A distinguished set of labels is used as type names, and the structure of a
graph type named T is specified by a graph structure definition of the form

$T ::= G_1 \mid G_2 \mid \cdots \mid G_n$

where the graph T consists of a T-edge with its linked nodes, and the $G_i$ are
graphs that may contain type edges again. Graph structure definitions can be
considered as predicates that generate the graph values of a type.

Fig. 10. The structure of queue and item graphs



Example 7 (Typing Rules for Queue and Item Graphs). Figure 10 defines the
graph structure of queue and item graphs. 

These graphs may be contained in queue and item frames, respectively. Queue 
graphs may be bound to queue variables like Q, whereas the variable X in the 
previous examples may be bound to a frame containing an item graph. The 
type $Q_\tau$ of queue graphs is generic. The type parameter $\tau$ can be instantiated
by any graph type, e.g. to the type $Q_I$ used in the previous examples.

The rules used for graph structure definitions are a well-studied special case
of context-free graph transformation, see [16,7]. Type checking thus amounts to
context-free graph parsing, e.g. as implemented in DiaGen [22].
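For chain-shaped structures such as queues, parsing degenerates to following the chain. The Python sketch below checks a flat edge list against an assumed definition "a queue is either a single node, or an I-edge followed by a queue"; this production is our reading of the queue structure, not a transcription of Figure 10, and general context-free graph parsing is considerably more involved.

```python
def is_queue_graph(edges, begin, end):
    """Check that edges form one chain of I-edges leading from begin to end.

    edges is a list of (label, source, target) triples."""
    remaining = list(edges)
    node = begin
    while node != end:
        outgoing = [e for e in remaining if e[1] == node]
        if len(outgoing) != 1 or outgoing[0][0] != "I":
            return False            # no production derives this configuration
        remaining.remove(outgoing[0])
        node = outgoing[0][2]
    return not remaining            # every edge must have been derived


well_typed = is_queue_graph([("I", "a", "b"), ("I", "b", "c")], "a", "c")
branching = is_queue_graph([("I", "a", "b"), ("I", "a", "c")], "a", "c")
```

The loop terminates because every iteration consumes one edge; a branching graph is rejected since no production of the assumed definition generates two item edges at the same node.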

5.2 Predicate Signatures 

The signature of a transformation predicate shall specify to which kind of pa- 
rameters it applies. As predicates are represented as graphs, their signature can 
be specified by graph structure definitions for a type $\pi$ of predicates.

Example 8 (Signature of Queue Predicates). Figure 11 specifies the signatures
for the predicates used in our examples. 





Fig. 11. The signature of queue predicates 




The predicate type $\pi$ has a varying number of parameter nodes so that the
rules of the graph structure definition have different left hand sides. All predicate
calls occurring in the examples of this paper can be derived with these rules
(together with those of Figure 10). The predicate variable T used in Example 6
is of type $\pi$.

5.3 Fine-Grained Typing

So far, edges are the only graph components that are typed explicitly, by
labelling them with different symbols. In a real programming language, nodes
should be typed in the same way. A particular type of edge could then be
restricted to have a certain number of links, to nodes of certain types, as in typed
graphs [4].

The degree of nodes, i.e. the number of links to some node, could be restricted
as well, typically by cardinality expressions like 1, 0..1, 1..n, 0..n. Similarly, the
overall number of nodes or edges contained in a graph could be restricted. Such
cardinality constraints are allowed in Progres [28].
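A degree check against such cardinality expressions can be sketched as follows. The parsing of the expressions and the edge representation, a list of (label, attached nodes) pairs, are assumptions made for illustration; they do not reproduce the syntax or semantics of Progres.

```python
def parse_cardinality(text):
    """Turn '1', '0..1', '1..n' or '0..n' into (lower, upper); None is unbounded."""
    lower, _, upper = text.partition("..")
    if not upper:
        upper = lower               # a single number means exactly that many
    return int(lower), (None if upper == "n" else int(upper))


def degree(edges, node):
    """Number of links to node, counting repeated attachments of one edge."""
    return sum(attached.count(node) for _, attached in edges)


def satisfies(edges, node, cardinality):
    lower, upper = parse_cardinality(cardinality)
    d = degree(edges, node)
    return d >= lower and (upper is None or d <= upper)


chain = [("I", ("a", "b")), ("I", ("b", "c"))]
inner_ok = satisfies(chain, "b", "1..n")      # b has two links
inner_too_many = satisfies(chain, "b", "0..1")
```

The same scheme extends to the overall number of nodes or edges by counting those instead of the links of one node.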

Pointed graphs could be specified to have a certain number of points, of cer- 
tain types. Then the links of a frame, and the points of its contents could be 
required to correspond in their number, and their types. A similar correspon- 
dence could be required for bindings, between variable edges and graphs, and 
for rules, between pattern and replacement graphs. 

6 Encapsulation 

Programming-in-the-large relies on the encapsulation of features in modules so 
that only some of them are visible to the public, and the others are protected 
from illegal manipulation. We sketch how graph classes and objects may be 
added to our proposal, and refer to packages that allow classes to be grouped into
subsystems.

6.1 Graph Classes and Objects 

A graph class defines a graph type (denoted by the name of the class), and
declares graph transformation predicates as its methods. The name of this type,
which equals that of the class, and some designated methods are public. The
structure of the type, and the other methods, are private.

Example 9 (The Queue Class). In Figure 9 we encapsulate primitive operations
on queues within a class.



[Figure: the queue class for the type $Q_\tau$, with the methods enqueue and dequeue]

In this small example, all methods are public. However, the graph structure
is visible only inside the class definition, thus adhering to the principle of data
abstraction.

Graph objects are frames that contain a graph of some graph class C. In other
classes, an object can only be manipulated by invoking a method of C.

At first glance, information hiding seems to contradict the expectation that 
the objects of a visual program shall be completely visible to a user. However, 
information hiding applies only to the program, not to the graphs manipulated 
in it. This can be defined by views, see [11]. 

6.2 Graph Packages and Libraries 

Experience with object-oriented languages (e.g. Java [2]) has shown that the
rather fine-grained encapsulation concept of classes should be complemented
with a simple coarse-grained package concept that allows a set of related
classes to be grouped in a subsystem. Such a concept can be easily added along the lines
of [18], as it just structures the namespace of a program, without changing the
evaluation.

Then libraries of graph classes can be easily predefined. Also non-graphical 
values like numbers and strings can be considered as predefined “graph” classes 
if the language allows graphs to be visualized in a non-standard way, e. g. as 
text. Then, the language is purely graphical, at least on the conceptual level. 
(B. Meyer [21] states such a purism principle for object-oriented languages). 

7 Conclusion 

In this paper we have proposed how a simple notion of graph transformation 
can be developed towards a programming language. Basically, this has been
achieved by distinguishing several kinds of edges:

— Frames allow graphs to be structured hierarchically so that hierarchical
subgraphs may be linked via their points.

— Variables bind subgraphs in rules so that they may be deleted or duplicated 
in a single rule application. 

— Buttons allow graph transformations to call each other recursively, with pa- 
rameters. Based on application conditions and higher-order predicates, con- 
trol structures can then be defined in the language itself. 

— Types define the structure of graph languages and the signature of predicates. 

— Objects are frames containing graphs of some type, with methods defined by 
a graph class. Classes provide for a fine-grained data-driven module concept. 

The extensions are graphical, rule-oriented, object-oriented, with some logical
and functional flavour (backtracking, and higher order predicates). These ideas
could become the kernel of a “complete” graph transformation language that
overcomes major deficiencies of today’s graph transformation languages.

Related Work

Structured graphs have already been proposed by several authors: The hierarchi- 
cal graphs of [24,13] have compound nodes. Pratt [24] considers only language 
generation similar to Example 7, and Engels and Schürr [13] do not consider
transformation at all. Schneider [27] considers graphs that have (simple) graphs 
as node and edge labels. The graphs of the old Agg system [20] support a rigid 
layering: Graphs, and the mappings between them can be viewed and manipu- 
lated as nodes and edges on the next layer of abstraction. This helps to structure 
the systems rather than the graph values. 

Predicates exist in Progres [28], but without graph and predicate
parameters. The language also provides textual logical and imperative control
structures. Fujaba [15] has graphical control structures, similar to UML [26].

Modules have recently been added to Progres; they support functional and 
data abstraction, but not graph aggregation. We are currently not aware of any 
other language or language proposal that features graph aggregation and classes. 
However, the new Agg system [14] and the Fujaba system [15] allow the use of
object-oriented concepts of their implementation language Java. After all, graph
structuring can then be realized by implementing plain graph objects as node 
or edge attributes of other graph objects. 



Future Work 

This work is closely related to Grace [1], a design activity for an
approach-independent graph-centered specification and programming language.
Specification issues have been ignored here, and complete independence of particular
notions of graphs and graph transformation had to be given up because
hierarchical graphs require certain properties of graphs. However, although this paper
is based on hypergraphs [16] and the gluing approach [3], it is not completely
approach-specific: Nesting can be defined by adding points, frames and buttons
to any kind of graph that has nodes, and a suitable notion of subgraph matching;
several transformation approaches can easily be implemented with the rules and
predicates proposed here. So we hope that our ideas will be fruitful for Grace
too.

The precise definition of the concepts presented in this paper has been
started in [8] (for frames and hierarchical graph transformation), and will be
continued. Some more concepts, like concurrency and distribution, have still to
be considered. For plain graph transformation, these concepts have been studied 
in [29] and [30] so that there is some hope that these results can be extended to 
our model. 

The visualization of graphs, rules, predicates and classes is still very elemen- 
tary in this paper. A lot of work has to be done in this area if graph transfor- 
mation shall become a really attractive paradigm for programming and spec- 
ification. Last but not least, such a language has to be implemented with a 
comprehensive program development environment.

Abstract. Current production control systems, e.g. for a car factory or for
any other complex industrial good, face two major problems. First, production
control systems need to become (more) decentralized to increase their
availability. It is no longer acceptable that a failure of a single central
production control computer or program causes hours of down-time for the whole
production line. Second, today’s market forces demand smaller lot sizes and
a more flexible mixture of different products manufactured in parallel on one
production line. Common specification languages for embedded systems,
like SDL, statecharts, etc. focus on the specification of (re)active components
of production control systems like control units, actuators (e.g. motors, valves),
and sensors (e.g. switches, light barriers, pressure, and temperature sensors),
and on the interaction of such reactive components via events and signals.

They provide no appropriate means for the specification of (more) intelligent, 
autonomous production agents. Such autonomous production agents need 
knowledge of manufacturing plans for different goods and of their surround- 
ing world, e.g. the layout of the factory or the availability of manufacturing 
cells. In addition, such production agents have to coordinate their access to 
assembly lines with other competing agents. This paper proposes to use (ob- 
ject-oriented) graph structures for the representation of production agents and 
graph (object structure) rewrite rules for the specification of their behaviour. 

We show how the FUJABA environment may be used to specify production 
agents and generate their implementation and to validate them via a graphical 
simulation. 

1 Introduction 

In [1] Blostein states that practically applicable graph transformation systems should
allow graph transformations and conventional (object-oriented) programming code
to be combined seamlessly. In order to achieve this, Fujaba (From UML to Java And
Back Again) combines common object-oriented notations, i.e. UML class diagrams,
UML activity diagrams, and UML collaboration diagrams, into a visual specification
language that integrates object-oriented modeling, normal (Java) code, and graph
transformations ([2], [3], [8]).

Like other CASE tools, the Fujaba environment allows editing UML class diagrams and
provides a code generator that generates Java classes containing attributes and method 
declarations. In addition, the Fujaba environment generates canonical implementations 
for associations. To specify method bodies, Fujaba uses so-called story diagrams. Story 
diagrams combine UML activity diagrams and UML collaboration diagrams. Activity 
diagrams define the high-level control flow between different activities. The activities 
may be specified either by normal Java code or by a graph transformation depicted as a 
UML collaboration diagram. The control flow specified by an activity diagram is 
translated into standard Java while and if statements (see [6]). Activities containing 
standard Java code are just copied into the resulting control structures. The graph 
transformation / collaboration diagrams are translated into Java code that relies on the 
class features generated from the class diagram, only. This is done using query 
optimization techniques adapted from the database field and refined for graph 
transformations ([2], [12]). The resulting code employs usual main memory object 
structures to represent the host graph. It does not rely on special graph libraries or graph 
rewrite rule interpreters but it manipulates the object structures by usual pointer 
operations. Thus, the resulting code blends seamlessly with other system parts and is 
not resource demanding. Finally, it is 100% pure Java code that runs on any Java
platform. Altogether, this enables the application of graph rewriting techniques for 
various new areas, e.g. for embedded systems. 

As part of the ISILEIT project funded by the German Research Society (DFG) our 
department studies the application of formal methods to embedded systems. Actually, 
the running example of this paper stems from the reference case study of the ISILEIT 
project [8]. The ISILEIT project is a joint project with electrical and mechanical
engineers. The focus is not just on theoretical results, but there are also strong demands
for practical benefits, e.g. the generation of running production systems from their
formal specification and the reduction of system reconfiguration times. Thus, we 
propose to attack these problems with Fujaba. 

Embedded systems are not yet a well known application area of graph rewriting
systems. Common specification languages for embedded systems, like SDL,
statecharts, etc., focus on the specification of (re)active components of production
control systems like control units, actuators (e.g. motors, valves) and sensors (e.g.
switches, light barriers, pressure and temperature sensors) and on the interaction of such
reactive components via events and signals. However, these common specification
languages provide no appropriate means for the specification of (more) intelligent,
autonomous production agents. Such autonomous production agents need knowledge
about manufacturing plans for different goods and of their surrounding world, i.e. the
layout of the factory and the availability of material and manufacturing cells. In
addition, such production agents have to coordinate their access to assembly lines with
other competing agents. Finally, production agents should deal with unforeseen
situations, like the drop-out of a production line or changes to the production plans.
Altogether the requirements for intelligent production agents are close to general
process modeling requirements. Fortunately, there already exists a sophisticated graph
grammar based approach to process modeling, namely the Dynamite project [5]. Thus,
our approach takes the dynamic task net idea from the Dynamite project,
adapts these task nets to the needs of intelligent production agents, and uses Fujaba to
specify and implement these nets.

In the following chapter a short introduction to a simple production system example is
given. Chapter 3 describes the class diagram and the activity diagram of a method. In
chapter 4 an excerpt of the generated Java code for the specified method is presented. The
following chapter introduces the test environment of Fujaba. The last chapter gives
conclusions and future work issues.

2 Sample Factory Example 

Figure 1 shows a scenario of a sample factory used as a running example throughout the 
paper. The example stems from a real-world system that serves as the case study for our 
ISILEIT project funded by the German Research Society (DFG) [8]. The factory is 
modeled as a flat building without levels or pillars. The floor is divided into 
rectangular fields, which allow us to address positions in the building. The factory 
contains different kinds of production places. A production place is, e.g., an assembly 
line, where goods arrive and are loaded on shuttles. 



Figure 1 Simple factory example (figure labels: Good, Robot, Assembly Line) 

Shuttles are able to move over the floor, where the fields on the floor serve as a map. 
Each field can be occupied by only one shuttle at a time, so a shuttle must be able to 
avoid a field that is occupied by another shuttle. Shuttles accept orders to carry goods 
from an assembly line to a storage. A shuttle can carry only one good at a time, and an 
order is executed until there are no more goods on the assembly line, the storage is 
full, or the shuttle receives a new order. 

In order to meet these requirements and to support the autonomy of shuttles, a working 
plan should be specified, for example with a task net. The net may look like the one 
shown in Figure 2. Each shuttle can be assigned to a working plan (here produce_good) 
and will then execute the actions in the working plan autonomously. Tasks may require 
certain resources, like task mill_cutting, which will require an assembly line later on. 
The task net is hierarchically organized, e.g. task produce_good has the subtasks 
fetch_material, mill_cutting, and deliver. The tasks are connected by next links in 
order to specify a sequence of execution. Some tasks may have more than one successor 
task, modeling independent steps that may be executed in any order. Similarly, one task 
may require multiple predecessor tasks to be completed before it becomes executable. 
Based on this task net idea, the following chapter outlines the static aspects of our 
example. 

Figure 2 Task net for shuttles 
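
The task net idea can be illustrated with a minimal Java sketch. It is not the code generated by Fujaba: class Task, its fields, and the helper buildProduceGood are hypothetical names chosen for illustration, and the strictly sequential next links between the three subtasks are an assumption, since Figure 2 may contain further branches. 

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a task net node; not the class generated by Fujaba.
class Task {
    final String name;
    final List<Task> subtasks = new ArrayList<>();     // hierarchical refinement
    final List<Task> successors = new ArrayList<>();   // outgoing next links
    final List<Task> predecessors = new ArrayList<>(); // incoming next links
    boolean done = false;

    Task(String name) { this.name = name; }

    void addSubtask(Task t) { subtasks.add(t); }

    // Creates a next link from this task to the given successor.
    void addNext(Task successor) {
        successors.add(successor);
        successor.predecessors.add(this);
    }

    // A task is executable if it is not done and all predecessors are done.
    boolean isExecutable() {
        if (done) return false;
        for (Task p : predecessors) {
            if (!p.done) return false;
        }
        return true;
    }

    // Subtasks that may be executed next, in any order.
    List<Task> executableSubtasks() {
        List<Task> result = new ArrayList<>();
        for (Task t : subtasks) {
            if (t.isExecutable()) result.add(t);
        }
        return result;
    }
}

class TaskNetDemo {
    // Builds produce_good with its three subtasks, assumed to run in sequence.
    static Task buildProduceGood() {
        Task produceGood = new Task("produce_good");
        Task fetch = new Task("fetch_material");
        Task mill = new Task("mill_cutting");
        Task deliver = new Task("deliver");
        produceGood.addSubtask(fetch);
        produceGood.addSubtask(mill);
        produceGood.addSubtask(deliver);
        fetch.addNext(mill);
        mill.addNext(deliver);
        return produceGood;
    }

    public static void main(String[] args) {
        Task plan = buildProduceGood();
        for (Task t : plan.executableSubtasks()) {
            System.out.println(t.name);
        }
    }
}
```

A task with several successors makes all of them executable at once, and a task with several predecessors stays blocked until the last of them is marked as done, which mirrors the two cases described above. 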

3 Class and Story Diagrams 

Based on the requirements and motivations outlined above, we propose the following 
sample design for intelligent production agents, cf. Figure 3. Class Shuttle models 
transport vehicles, which are our production agents. These production agents need to 
know about the Factory they are living in, about the AssemblyLines and Storages 
they are negotiating with, and about the Goods to be produced. Therefore, class Actor 
has an actors association to the factory. Class Actor itself is merely a design decision 
we made for this example; it only encapsulates the common parts of the other classes. 
Next, production agents need to be aware of their own location and of the locations of 
assembly lines and storages. This is achieved via class Field. Each field has a certain 
position within the factory stored in its attributes x and y. Links of type horizontal and 
vertical allow shuttles to move from a field to its neighbors. A target link identifies the 
current movement target, and at links model the current locations of all Actors. Next, 
class Task models the working plan of a production agent as described above. We use 
subtasks links to structure hierarchical plans and next links to represent the successor 
relationship. Finally, tasks may allocate assembly lines or storages via resource links. 
Note that this modeling of work plans is deliberately kept very simple. More 
sophisticated versions could, e.g., additionally model priorities, durations, and other 
PERT chart properties. The root of a production plan is attached to its executing 
production agent via a plan link. In addition, a link named current identifies the active 
task. So far, a production agent has knowledge about the configuration of the factory it 
is living in. To allow multiple production agents to coordinate themselves, they need to 
know about each other and about their plans. Actually, the class diagram described so 
far allows such information to be represented within each production agent. However, to 
minimize communication efforts, shuttle coordination is restricted to waiting queues at 
the different assembly lines. When planning a manufacturing task, a production agent 
may look up the waiting queues of assembly lines in order to choose one with minimal 
waiting time. 

Figure 3 Class diagram of the factory example 
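
The queue-based coordination can be sketched as follows. This is an illustrative assumption rather than the Fujaba design itself: the stand-in class AssemblyLine, its queue field, and the method shortestQueue are hypothetical, and queue length is used as a simple proxy for waiting time. 

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for class AssemblyLine of Figure 3.
class AssemblyLine {
    final String name;
    final List<String> queue = new ArrayList<>(); // names of waiting shuttles

    AssemblyLine(String name) { this.name = name; }
}

class QueueSelection {
    // Returns the line with the fewest waiting shuttles, or an empty
    // Optional if there is no line at all.
    // Assumption: queue length approximates waiting time.
    static Optional<AssemblyLine> shortestQueue(List<AssemblyLine> lines) {
        return lines.stream().min(Comparator.comparingInt(l -> l.queue.size()));
    }

    public static void main(String[] args) {
        AssemblyLine a = new AssemblyLine("lineA");
        a.queue.add("shuttle1");
        a.queue.add("shuttle2");
        AssemblyLine b = new AssemblyLine("lineB");
        b.queue.add("shuttle3");
        System.out.println(shortestQueue(Arrays.asList(a, b)).get().name);
    }
}
```

Because each shuttle only reads the queues attached to the assembly lines, no direct shuttle-to-shuttle communication is needed for this decision. 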

Equipped with this static design, we are now able to specify the behavior of our 
production agents using story diagrams. Story diagrams combine UML activity 
diagrams and graph rewrite rules. UML activity diagrams contain activities 
and directed transitions among activities to specify the (high-level) control flow. In 
UML, activities can contain only "pseudo" code. Story diagrams extend the notation 
by using either Java code or graph rewrite rules for the specification of activities. 
Special transition guards allow branching on the success or failure of a graph rewriting 
step. 

Our production agents shall not just execute a fixed plan (which could simply have 
been hard coded); they should refine their production plans depending on the 
situation, e.g. the availability of assembly lines. As a simplified example of such 
a flexible planning step, Figure 4 shows a story diagram used to plan a millcutting step. 
Fujaba uses a notation for graph rewrite rules that is similar to UML collaboration 
diagrams. In this notation, the left- and right-hand sides of a rewrite rule are displayed 
in one picture: elements to be deleted are crossed out with two (red) lines and newly 
created elements are marked by attached (green) plus signs (see [2], [3] for more details). 
Such a graph rewrite rule is executed by first finding a match of the variables and links 
in the rewrite rule to objects and links in the runtime object structure and then executing 
the depicted modifications. Thus, the example operation millCutting looks up its factory 
and searches for an assembly line al that (1) is in state active, (2) currently operates a 
millcutter, and (3) has no other shuttle waiting in its queue. In addition, we look up 
field f, which identifies al's position. On success, operation millCutting creates the three 
new subtasks go, hand_over, and take_over for the millcutting step. Field f is marked as 
the shuttle's next movement target and the shuttle enqueues itself at the assembly line. 
Finally, tasks hand_over and take_over mark the chosen assembly line al as their 
resource. Note that Figure 4 shows only the first task refinement attempt. When the 
depicted graph rewrite rule is not applicable, execution proceeds along the [failure] 
transition and considers less optimal plan refinements. Otherwise, we proceed along the 
[success] transition; operation millCutting terminates and the new current go task is 
executed. 




Figure 4 Story-diagram of operation millCutting 

For each kind of task, such an execution routine is provided. Plan execution starts at the 
plan root. It chooses an executable task, calls the corresponding execution routine, 
and marks the task as done. In addition, our model allows currently executed tasks to 
be suspended in order to switch to another executable task, e.g. for optimization reasons 
or in order to react to unforeseen difficulties such as the drop-out of an assembly line. 
Once a plan is fully executed, it is replaced by a new one and execution starts again. In 
addition, a shuttle might keep some proven plan parts in order to retain successful 
execution paths. 

Note that, in addition to the plan execution operations shown, which model the 
knowledge of our production agents and their manufacturing strategies, shuttles also 
need to control their sensors and actuators, such as light bars and motors, and they 
have to react to signals sent from other agents or from humans, e.g. assigning them new 
tasks. This reactive behavior is well addressed using SDL and statecharts. Thus, Fujaba 
provides support for SDL and statecharts, too. This topic is addressed in the Fujaba 
tool demonstration description, which is also part of this volume [10]. 

4 Java Code Generation 

FUJABA provides a generator that generates Java code from a specification. For each 
class in a class diagram, the corresponding Java class is generated in a canonical way. 
Attributes are encapsulated and access methods are generated. Method declarations are 
mapped directly to corresponding Java method declarations, and associations are 
mapped to pairs of references and adequate access methods within the corresponding 
classes (for more details see [2], [3]). 

The Java code generation for story diagrams is divided into two tasks. First, the control 
flow is mapped to imperative control structures like if and while statements. To enable 
this translation, story diagrams are restricted to so-called well-formed transition 
structures that correspond directly to nested branches and loops. 

Figure 5 Java code for graph rewrite rules 



 1  public void millCutting () { ... 
 2    // first graph rewriting rule 
 3    try 
 4    { 
 5      sdmSuccess = false; 
 6      millCutting = this.getCurrent (); 
 7      SDM.ensure (millCutting != null); 
 8      factory = this.getFactory (); 
 9      SDM.ensure (factory != null); 
10      Iterator actors = factory.iteratorOfActors (); 
11      while (!sdmSuccess && actors.hasMore ()) { 
12        tmp = actors.next (); 
13        try { 
14          SDM.ensure (tmp instanceof AssemblyLine); 
15          al = (AssemblyLine) tmp; 
16          SDM.ensure (al.getState () == "active"); 
17          SDM.ensure (al.getTool () == "millcutter"); 
18          SDM.ensure (al.queueIsEmpty ()); 
19          f = al.getAt (); 
20          SDM.ensure (f != null); 
21          // match found, execute modifications 
22          this.setCurrent (null); 
23          goto = new Task ("goto"); 
24          handover = new Task ("handover"); 
25          takeover = new Task ("takeover"); 
26          this.setCurrent ( 
27          this.setTarget (f); 
28          al.addToQueue (this); 
29          millCutting.addToSubTasks (goto); 
30          millCutting.addToSubTasks (handover); 
31          millCutting.addToSubTasks (takeover); 
32          goto.addToNext (handover); 
33          handOver.addToNext (takeover); 
34          handOver.addToResources (al); 
35          takeOver.addToResources (al); 
36          sdmSuccess = true; 
37        } catch (SDM.Exception sdmExcept) { } 
38      } // while (actors.hasMore ()) 
39    } catch (SDM.Exception sdmExcept) { } 
40    if (sdmSuccess) 
41    { 
42      return; 
43    } else 
44    { // next graph rewrite rule 
45 

Figure 5 shows the Java implementation of the millCutting method of class Shuttle. 
Lines 2 to 39 implement the first graph rewrite rule shown in Figure 4. The if statement 
in line 40 realizes the control flow depicted by the success and failure transitions at the 
bottom of Figure 4. If the first graph rewrite rule was successful, the first if-branch 
terminates the execution. Otherwise, the else branch is executed. 

In a second task, the code for activities is generated. Activities that contain just Java 
code are copied one-to-one. For graph rewrite rules we employ translation mechanisms 
as described in [2], [12]. In Figure 5, lines 2 to 39 show the generated Java code for the 
first graph rewrite rule. The execution starts by binding objects to the variables 
specified in the rule. For example, in line 6 the variable millCutting is bound to an object 
which is accessible via a current link from the this object. Line 7 checks whether line 
6 actually retrieved an object and throws an exception otherwise. This exception is 
caught by the catch statement at line 39. Note that variable sdmSuccess is set to false 
in line 5 and thus signals that the execution of the first graph rewrite rule has failed 
until it is set to true in line 36. Line 36 is reached only when all SDM.ensure clauses are 
passed successfully. While line 6 looks up a to-one association, the link from variable 
this to variable al belongs to a to-many association, cf. Figure 3. Thus, we need a loop 
(lines 10 to 12) to look up all reachable neighbors until we reach one that meets all 
requirements: we are looking for an assembly line (line 14) which is active (line 16), 
which operates a millcutter (line 17), and which has an empty queue (line 18). Once all 
participants are identified, we execute the deletions (line 22), create new objects and 
links (lines 23 to 35), and finally signal success of the rewrite step (line 36). Note that 
the latter terminates the while loop in line 11. See [3] for more details on the code 
generation for graph rewrite rules. 
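
The matching scheme described above (check a condition, throw on failure, catch inside the loop, try the next candidate) can be reproduced in a small self-contained sketch. The class SDM below is a hypothetical reconstruction, not the actual Fujaba runtime class: it is assumed here that SDM.ensure throws an unchecked SDM.Exception when its condition is false. The candidate list and the conditions in firstMatch are made up for illustration, and the standard Iterator method hasNext is used. 

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical reconstruction of the helper used in Figure 5; the real
// Fujaba runtime class may differ.
class SDM {
    // Thrown when a match condition fails. Assumed to be unchecked here.
    static class Exception extends RuntimeException { }

    static void ensure(boolean condition) {
        if (!condition) throw new Exception();
    }
}

class MatchDemo {
    // Returns the first candidate that is a String of length 3 starting
    // with "a", or null if no candidate passes all checks.
    static String firstMatch(List<Object> candidates) {
        boolean success = false;
        String match = null;
        Iterator<Object> it = candidates.iterator();
        while (!success && it.hasNext()) {
            Object tmp = it.next();
            try {
                SDM.ensure(tmp instanceof String);
                String s = (String) tmp;
                SDM.ensure(s.length() == 3);
                SDM.ensure(s.startsWith("a"));
                match = s;       // all checks passed: bind the match
                success = true;  // ends the while loop
            } catch (SDM.Exception e) {
                // this candidate failed a check; try the next one
            }
        }
        return match;
    }

    public static void main(String[] args) {
        List<Object> candidates =
            Arrays.<Object>asList(42, "xyz", "abcd", "abc", "aaa");
        System.out.println(firstMatch(candidates));
    }
}
```

As in Figure 5, a failed check abandons only the current candidate, while the success flag stops the search after the first complete match. 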

The important properties of the generated Java code are that it operates on ordinary 
main-memory object structures, that it uses only small library functions like predefined 
container classes, and that the rule execution is compiled directly into the code, so it 
does not rely on an additional rule interpreter. Thus, the resulting code is not very 
resource demanding. In addition, it is 100% pure, platform-independent Java code that 
does not use any native methods. Altogether, these features enable us to use Fujaba for 
the generation of code for embedded systems. 

5 Simulating the Specification 

Our approach to specifying production control systems already allows us to construct 
very flexible production agents that support very small lot sizes and that may 
manufacture different goods in parallel. These production agents are able to deal with 
unforeseen situations, like assembly line drop-outs. They form a decentralized control 
system that is not threatened by the drop-out of a single central production control 
computer. Still, we have to meet the requirement of being able to switch to new products 
without long down-times caused by system tests. To meet this requirement, we propose 
to test production processes beforehand with Fujaba's graphical debugging and 
simulation environment, called DOBS (Dynamic Object Browsing System), cf. Figure 6. 
The DOBS environment allows the user to visualize (Java) runtime object structures and 
to invoke methods on objects interactively. For parameterized methods, appropriate user 
dialogs are created dynamically. In addition, DOBS is able to deal with the (re)active 
objects generated by Fujaba, which run their own threads and thus may change their state 
or execute methods autonomously. Thereby, DOBS may serve as a first simple graphical 
user interface for testing a story diagram specification. For example, the user may 
initiate a production process and then, e.g., simulate the drop-out of a certain assembly 
line and analyse the reaction of the running production agents. Conversely, during 
the simulation one might recognize bottlenecks and try to solve the problem by 
adding more assembly lines or shuttles interactively. Another option is to reshape the 
floor layout or to move some assembly lines around in order to shorten distances. 
Deletion and addition of floor elements might also be used to simulate walls, doors, and 
pillars. Finally, one might even "edit" the task net of some production agents in order 
to test alternative production plans. 

Figure 6 Simulating the sample factory example 

6 Conclusions and Future Work 

This paper shows the applicability of graph rewrite techniques within the area of 
embedded systems. According to our experience within the ISILEIT project, graph 
rewrite rules are an ideal means for the specification of the general behavior of flexible 
production agents. The high level of abstraction provided by graph rewrite systems 
allows us to model the 'world' our production agents are living in and the knowledge 
they need to execute their tasks very easily. The experience drawn from the Dynamite 
project enabled us to provide our production agents with very flexible manufacturing 
plans. However, to turn embedded systems into an application area for graph rewrite 
systems, several properties of the Fujaba approach were very important. First, the 
UML-like notation employed by Fujaba significantly facilitated the communication of 
story diagram specifications to the other ISILEIT project partners. Note that a number 
of these partners come from fields like electrical and mechanical engineering. Second, 
today's embedded systems have certain resource restrictions. This topic is well 
addressed by the Fujaba code generation strategies, which produce simple Java code 
using ordinary main-memory object structures. We expect that Java will soon become 
available for a wide spectrum of embedded systems. Then, the 100% pure Java code 
generated by Fujaba will be executable on the quite heterogeneous hardware platforms 
employed in the area of embedded systems. 

One unexpected advantage of graph rewrite techniques and of the code generation 
strategies of Fujaba is their quite defensive programming style. As Figure 5 shows, our 
code thoroughly checks all kinds of conditions required for the execution of a graph 
rewrite step by using numerous SDM.ensure clauses. This results in very reliable code 
that deals correctly with many kinds of unforeseen situations. During the simulations 
with our dynamic object browsing system DOBS, our engineering partners were 
impressed by the robustness of the application. One can delete assembly lines or even 
fields without causing system crashes of running production agents. It is possible to 
reconfigure the factory layout while the production agents are active, and they still 
react reasonably. Once a new factory configuration is (partly) established, the production 
agents easily adapt to the changed setting and continue to produce goods. We hope to 
be able to transfer this robustness, reliability, and flexibility from the simulations to 
real embedded systems. This is current work. 

FUJABA has been under development since November 1997. The current 'release' version 
provides editors for UML class diagrams, UML activity diagrams, and object structure 
rewrite rules. In addition, it comprises a code generator and a basic consistency analyser. 
Currently, FUJABA is being extended with statecharts and SDL. For both languages, first 
versions of editors, consistency analysers, and code generators are available and are now 
in their testing phases. DOBS, the Dynamic Object Browsing System, became part of 
FUJABA at the beginning of 1998. Extensions of DOBS towards a graphical simulation 
environment are also current work and are scheduled for 2000. 

Abstract. Structured Analysis has been one of the most widely used 
specification notations of the last decades. Friendliness and flexibility 
promoted its use, but informality hampered its precision and efficacy. The 
many proposals that tried to overcome the problem improve precision, 
but constrain flexibility. They propose formal and specific interpretations 
of Structured Analysis that, even if meritorious, do not impact on 
day-to-day practice. To meet the goal, formalization attempts should not try 
to impose particular interpretations, but they should allow users to tailor 
the interpretation to their current needs. 

In this paper, we present a solution that merges precision and flexibility 
to provide a customizable and formal definition of Structured Analysis. 
Formalization consists of a set of customization rules and a consistency 
framework. Customization rules, based on graph grammars, formalize 
the different behaviors of notation elements by defining a mapping onto 
a formal model. The consistency framework groups complementary rules, 
which give different semantics to the same elements, and constrains the 
scope of each rule, that is, identifies the set of rules that may be affected 
by a change. 

1 Introduction 

Structured Analysis (hereafter, SA) is one of the notations that industry has 
widely used in the last decades. Friendliness, accumulated experience, and 
flexibility promoted its use, but intrinsic imprecision hampered its efficacy. 
Many proposals tried to overcome the problem by complementing SA with formal 
methods [9]. They associate SA with formal semantics by proposing mappings from 
informal specifications onto formal models. For example, Semmens and Allen ([23]) 
and Cohen ([25]) formalize SA through Z and Larch, respectively. France ([11]) 
proposes a translation technique from SA to SMoLC communicating processes to 
automatically generate formal representations of SA models. Elmstrøm et al. 
([17]) present a fixed set of rules to give SA a special-purpose interpretation by 
means of high-level Petri nets. Fencott et al. ([10]) propose semantic functions 
that use Z to both translate and annotate SA elements. Pethersohn et al. ([20]) 
formalize SA using synchronous models. These proposals differ in the notation 
considered (graphical shapes), the formal method chosen, and the behavior 
selected, but they all cut flexibility by fixing particular interpretations. In 
contrast, informal notations, such as Structured Analysis, have always been used 
because of their flexibility and the many different interpretations that they 
permit ([7]). Thus, a valuable formalization should be as precise as a formal 
method and as flexible as an informal notation. Formalization efforts should 
consider all possible interpretations of an informal notation as well as supply 
means to define new feasible interpretations. Formalizing informal notations 
should mean formalizing neither single notations nor sets of independent 
notations, but notation families. A notation family comprises several related 
notations that share the same graphical symbols, but associate them with 
different semantics. New formalizations can be added to the family either by 
associating new interpretations with notation elements or by recombining 
existing ones. 

* This work has been partially supported by the European Community under the 
ESPRIT IDERS (EP8593) and the KIT FORMSPEC Projects (KIT-125). 

A first step towards formalizing notation families has been proposed in [4], 
which illustrates a formalization engine that can be customized with different 
sets of rules to address different notations. Recent attempts at formalizing 
complex notation families, such as Structured Analysis, revealed the advantages 
as well as the limitations of the approach. We verified that the approach 
supports the formal definition of complex notations well, but we also identified 
unexpected difficulties in reusing (sharing) rules among formalized notations: 
The preliminary experiments resulted in different sets of rules for each notation 
of the family. In this paper, we propose a better solution to the problem of 
formalizing Structured Analysis as a notation family. It specializes the proposal 
in [4] by framing rules in a hierarchy that identifies complementary rules and 
their scope. Complementary rules formalize different interpretations for the same 
notation element. The hierarchy identifies the rules that could be affected by a 
modification. 

The remainder of this paper is organized as follows. Section 2 frames the 
problem by summarizing Structured Analysis. Section 3 describes the approach 
by introducing both customization rules (Section 3.1) and the consistency frame- 
work (Section 3.2). Section 4 exemplifies the approach by presenting some alter- 
native interpretations for making a process consume values from its input flows. 
Section 5 indicates the main results and ongoing work. 

2 Structured Analysis 

Structured Analysis ([18, 12, 24, 15, 8]) is a notation family that has evolved 
from the mid-seventies to today. In this paper, we focus on De Marco-like 
notations; real-time extensions are studied in [2]. 

Structured Analysis comprises processes, data flows, data stores, terminators, 
and splitting and merging points. Processes model functional data 
transformations. They are given different interpretations in terms of the number 
of consumed inputs (all/some), the number of produced outputs (all/some), and the 
duration of their executions. Data flows model the flow of data among processes 
and terminators. They are given different interpretations depending on the 
effect of reading (destructive or not), writing (blocking or not), and capacity 
(one versus several queued messages). Data stores model memories; feasible 
interpretations include presence or absence of structure and different access 
rights (e.g., blocking versus non-blocking). Terminators model the external world 
(embedding). They are interpreted as either pure interface elements or light 
processes that provide inputs to the system and store results. Splitting 
(merging) points describe the divergence (convergence) of flows. Splitting and 
merging points can be either passive, that is, they do not involve data 
transformation, or active, that is, they filter input data. 

Some of the interpretations for processes are exemplified in Figure 1, which 
presents a simple SA model with a single process that defines a hot drink 
dispenser. 

Fig. 1. Process Dispense Hot Drinks 



Process Dispense Hot Drinks has two input flows, milk and coffee, and 
produces a hot drink on the output flow. The data consumption from input 
flows can be interpreted in several ways, leading to different behaviors. If 
the process read values from single flows only, it would be able to produce coffee 
and milk, but no combinations of them (i.e., cappuccino). If the process read values 
from both inputs, it would produce cappuccino, but not milk and coffee separately. 
If the process read values either from a single flow or from both of them, it 
could produce milk and coffee as well as cappuccino. If both coffee and milk were 
available, but only coffee were used, we would have different alternatives to define 
the way to handle unused values: They could remain available for subsequent 
use or be deleted as not used. Even in this simple example, we can identify 
three ways of consuming input values and two ways of dealing with unused 
inputs, thus defining six different reasonable behaviors for process elements. 
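
The count of six behaviors follows from combining the two independent choices. The following Java sketch enumerates them for the hot drink example; the enum and method names (Consumption, UnusedInputs, Drink, producible, allBehaviors) are hypothetical and serve only to illustrate the combinatorics. 

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Three ways of consuming input values (hypothetical names).
enum Consumption { SINGLE_FLOW, ALL_FLOWS, ANY_SUBSET }

// Two ways of dealing with unused input values (hypothetical names).
enum UnusedInputs { KEEP, DISCARD }

enum Drink { COFFEE, MILK, CAPPUCCINO }

class ProcessBehaviors {
    // Drinks that process Dispense Hot Drinks can produce under each
    // consumption policy, following the discussion of Figure 1.
    static Set<Drink> producible(Consumption c) {
        switch (c) {
            case SINGLE_FLOW: return EnumSet.of(Drink.COFFEE, Drink.MILK);
            case ALL_FLOWS:   return EnumSet.of(Drink.CAPPUCCINO);
            default:          return EnumSet.allOf(Drink.class);
        }
    }

    // Every combination of a consumption policy with an unused-input policy.
    static List<String> allBehaviors() {
        List<String> result = new ArrayList<>();
        for (Consumption c : Consumption.values()) {
            for (UnusedInputs u : UnusedInputs.values()) {
                result.add(c + "/" + u);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String behavior : allBehaviors()) {
            System.out.println(behavior);
        }
    }
}
```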

3 Formalization Approach 

The formalization approach proposed in this paper is based on formal rules, 
called customization rules, that are organized in a hierarchical framework, called 
consistency framework. Customization rules allow users to associate the behavior 
they prefer with each notation element by defining a functionally equivalent 
high-level timed Petri net (hereafter, HLTPN). The consistency framework defines 
the scope of customization rules and supplies the glue that transforms a set of 
rules into a consistent and complete interpretation of SA. 



3.1 Customization Rules 

Customization rules are based on the approach described in [4]. They recall the 
proposals for defining graphical languages presented in [22, 21]. Each rule is a pair 
of attributed programmable graph grammar productions. The Abstract Syntax 
Graph Grammar (ASGG) production identifies a legal syntactic transformation 
on the informal model at an abstract level. The corresponding Semantic Graph 
Grammar (SGG) production defines the semantics by suitably transforming the 
semantic representation. In this paper, we illustrate the approach by formalizing 
Structured Analysis with HLTPNs. HLTPNs are Petri nets where tokens are 
associated with information and transitions are associated with a predicate (a 
condition on the input tokens), an action (the functional transformation between 
input and output tokens), and a timing function (minimum and maximum firing 
time of the transition). Interested readers can find a detailed presentation in [13]. 
A graph grammar production comprises nodes and edges. Nodes can either 
indicate notation elements or notation connectors, or be purely abstract symbols. 
Nodes of ASGG productions can correspond to processes or terminators (i.e., 
elements), or flows (i.e., connectors). Nodes of SGG productions can correspond 
to places or transitions (i.e., elements), or arcs (i.e., connectors). Nodes that 
correspond to purely abstract symbols are introduced to facilitate the application 
of the rules, as illustrated in the following examples. Edges represent relations 
among elements. For example, an edge can connect a node corresponding to a 
transition to a node corresponding to an arc to define the source of the arc. 
Both nodes and edges are typed to indicate their role. Customization rules 
require syntactically correct models.¹ 
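
The three ingredients of an HLTPN transition (predicate, action, timing function) can be made concrete with a small Java sketch. It is a deliberately simplified illustration and not the formal HLTPN semantics of [13]: the classes Place and Transition are hypothetical, each input place contributes its oldest token, and the timing function is reduced to a firing window [tMin, tMax] measured from the moment the transition becomes enabled. 

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// A place holds tokens; each token carries a value (hypothetical sketch).
class Place {
    final Deque<Object> tokens = new ArrayDeque<>();
}

class Transition {
    final List<Place> inputs;
    final Place output;
    final Predicate<List<Object>> predicate;      // condition on input tokens
    final Function<List<Object>, Object> action;  // input values -> output value
    final double tMin, tMax;                      // firing window after enabling

    Transition(List<Place> inputs, Place output,
               Predicate<List<Object>> predicate,
               Function<List<Object>, Object> action,
               double tMin, double tMax) {
        this.inputs = inputs;
        this.output = output;
        this.predicate = predicate;
        this.action = action;
        this.tMin = tMin;
        this.tMax = tMax;
    }

    // Values of the oldest token in each input place, or null if one is empty.
    private List<Object> peekInputs() {
        List<Object> values = new ArrayList<>();
        for (Place p : inputs) {
            if (p.tokens.isEmpty()) return null;
            values.add(p.tokens.peekFirst());
        }
        return values;
    }

    // Enabled if every input place holds a token and the predicate accepts them.
    boolean isEnabled() {
        List<Object> values = peekInputs();
        return values != null && predicate.test(values);
    }

    // Tries to fire 'delay' time units after enabling; returns false if the
    // transition is not enabled or the delay lies outside [tMin, tMax].
    boolean fire(double delay) {
        if (!isEnabled() || delay < tMin || delay > tMax) return false;
        List<Object> values = new ArrayList<>();
        for (Place p : inputs) values.add(p.tokens.removeFirst());
        output.tokens.addLast(action.apply(values));
        return true;
    }
}

class HltpnDemo {
    // A transition that consumes milk and coffee and produces a cappuccino.
    static Transition dispense(Place milk, Place coffee, Place out) {
        return new Transition(Arrays.asList(milk, coffee), out,
                values -> true,
                values -> "cappuccino",
                0.0, Double.POSITIVE_INFINITY);
    }

    public static void main(String[] args) {
        Place milk = new Place(), coffee = new Place(), out = new Place();
        milk.tokens.add("milk");
        coffee.tokens.add("coffee");
        Transition t = dispense(milk, coffee, out);
        System.out.println(t.fire(1.0) + " " + out.tokens.peekFirst());
    }
}
```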

ASGG productions build an abstract representation of SA models. SGG 
productions define the semantic representation, that is, the corresponding HLTPN. 
For example, Figure 2 shows the abstract and semantic representations of the 
interfaces of process Dispense Hot Drinks of Figure 1. The abstract interface 
comprises a node of type PMarker (P) and two sets of nodes of type Input (I) 
and Output (O). The node of type P is a purely abstract symbol that relates all 
syntactic elements that belong to the process through edges of type belong to 
(b). The nodes of types I and O represent input and output ports, respectively.² 
The corresponding semantic representation of the process interfaces comprises a 
node of type PMarkerS (P) and two sets of nodes of type Input (In) and Output 
(Out). Nodes of type In and Out are places that model input and output ports. 
The node of type P is analogous to the corresponding node in the abstract graph 
and relates all nodes that define the semantic representation of the process with 
edges of type belong to (b). 

¹ Customization rules can check for syntactic correctness, but they would be more 
complex. 

² Abstract nodes are added by applying ASGG productions. They do not belong to 
user-defined models. 





(a) Abstract representation (b) Semantic representation 

Fig. 2. Abstract and semantic representations of the interfaces of process Dispense 
Hot Drinks 



A complete representation of a process would require the internals to be 
formalized, that is, how the process consumes values from its input ports and 
produces values on its output ports. The rule shown in Figure 3 indicates how 
to model input consumption. In this case, we choose to model a process that 
always consumes values from all its input flows (process Dispense Hot Drinks 
of Figure 1 would produce only cappuccino). 

The two productions are represented as graphs. Each production is composed 
of three parts that correspond to the three graphical regions separated by a 
Y, as proposed in [14]. The left-hand side graph indicates the subgraph to be 
substituted by applying the production. The right-hand side graph indicates the 
graph to be added. The edges between left- and right-hand side graphs, through 
the top graph, indicate the connectivity of the added subgraph with the host 
graph. Each node is associated with a unique identifier: Nodes with the same 
identifier in both the left- and right-hand side of the production are preserved 
while applying the production. 
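
The role of node identifiers can be sketched with plain set operations. The class Production below is a hypothetical illustration that tracks identifiers only and ignores edges, types, attributes, and the embedding through the top graph: identifiers in both sides are preserved, identifiers only in the left-hand side are removed, and identifiers only in the right-hand side are added. 

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: a production reduced to the node identifiers of its
// left-hand side and right-hand side graphs.
class Production {
    final Set<Integer> lhs; // identifiers in the left-hand side graph
    final Set<Integer> rhs; // identifiers in the right-hand side graph

    Production(Set<Integer> lhs, Set<Integer> rhs) {
        this.lhs = lhs;
        this.rhs = rhs;
    }

    // Nodes that occur on both sides survive the application.
    Set<Integer> preserved() {
        Set<Integer> s = new TreeSet<>(lhs);
        s.retainAll(rhs);
        return s;
    }

    // Nodes that occur only on the left-hand side are removed.
    Set<Integer> deleted() {
        Set<Integer> s = new TreeSet<>(lhs);
        s.removeAll(rhs);
        return s;
    }

    // Nodes that occur only on the right-hand side are newly created.
    Set<Integer> added() {
        Set<Integer> s = new TreeSet<>(rhs);
        s.removeAll(lhs);
        return s;
    }

    public static void main(String[] args) {
        // Left-hand side holds node 1; right-hand side holds nodes 1 and 2.
        Production p = new Production(
                new TreeSet<>(Arrays.asList(1)),
                new TreeSet<>(Arrays.asList(1, 2)));
        System.out.println("preserved=" + p.preserved()
                + " added=" + p.added() + " deleted=" + p.deleted());
    }
}
```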

The ASGG production (Figure 3(a)) applies to an abstract node of type 
P, which belongs to the left-hand side graph, and preserves it, since it belongs 
to both the left- and right-hand side graphs. The production adds a node of 
type InCon (IC) and an edge of type b. It adds also an edge of type connect 
(c) between the added IC node and each I node that is connected to the P 
node through a b edge. These connections are defined by means of the chain 
that connects the left- and right-hand sides through the top graph. The textual 
attributes indicate that the newly created node (node 2) has three attributes: 
name, type and action. The value of attribute name is the concatenation of the 
name of the P node (node 1) with string IC; the value of attribute type is InCon; 
the value of attribute action is provided externally (typically by the user). 
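
The effect of this ASGG production can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' rule interpreter: the dictionary-based graph encoding, the edge direction, the function name add_input_consumption_all, and the input names milk and coffee are all hypothetical.

```python
def add_input_consumption_all(graph, process_id, action=None):
    """Sketch of the 'consume all inputs' ASGG production for one P node."""
    nodes, edges = graph["nodes"], graph["edges"]
    process = nodes[process_id]
    if process["type"] != "P":
        raise ValueError("the production applies only to P nodes")

    # I nodes that are connected to the P node through a b edge.
    inputs = [src for (src, label, dst) in edges
              if label == "b" and dst == process_id and nodes[src]["type"] == "I"]

    # Add node 2 (the IC node); node 1 (the P node) is preserved.
    ic_id = max(nodes) + 1
    nodes[ic_id] = {
        "name": process["name"] + "IC",  # name of node 1 concatenated with "IC"
        "type": "InCon",
        "action": action,                # provided externally, e.g. by the user
    }
    edges.append((ic_id, "b", process_id))         # belong-to edge
    edges.extend((ic_id, "c", i) for i in inputs)  # one connect edge per I node
    return ic_id


# Hypothetical abstract representation: one process with two input ports.
graph = {
    "nodes": {
        1: {"name": "DispenseHotDrinks", "type": "P"},
        2: {"name": "milk", "type": "I"},
        3: {"name": "coffee", "type": "I"},
    },
    "edges": [(2, "b", 1), (3, "b", 1)],
}
ic = add_input_consumption_all(graph, 1, action="mix")
print(graph["nodes"][ic])
```

The number of connect edges follows from the host graph rather than from the rule itself, which is the role the embedding part of the production plays in the text.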
[Figure 3: attribute annotations recovered from the figure]

(a) ASGG production
    2.name = 1.name * "IC";
    2.type = "InCon";
    2.action = ?;

(b) SGG production
    2.name = @2.name@;
    2.type = "Start";
    2.predicate = [value not legible];
    2.action = @2.action@;
    2.tMin = "enab";
    2.tMax = "enab + ∞";
    2.absNode = @2.name@;

    1.predicate = 1.predicate * "AND" * 2.name * "!= NIL";

Fig. 3. Sample customization rule 
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The corresponding SGG production is shown in Figure 3(b). It adds a node 
of type Start, that is, a transition, and two nodes of type arc (A) between the 
newly created Start node and each In node that belongs to the semantic 
representation of the process. The number of In nodes to be connected to the Start 
node cannot be determined statically: Different processes can have a different 
number of input flows, thus a different number of arc nodes should be added. 
The simple Y rule used for the ASGG production can add only a fixed number 
of nodes. To add a variable number of nodes we need programmed productions. 
A programmed production is given as a main production and a set of subpro- 
ductions. Special edges, hereafter p-edges, represent subproduction invocations 
and are drawn using dashed lines. Subproductions are characterized by p-edges 
in the left-hand side graph. All invoked subproductions must be applied to com- 
plete the application of the production: They are invoked depth-first according 
to the declaration order. The programmed production of Figure 3(b) comprises 
only one subproduction. The main production applies to a P node. It adds a Start 
transition, an edge of type b, and a set of double arc (da) p-edges: One for 
each In node b-connected to the P node in the target graph. The subproduction 
is invoked for each added da p-edge, and substitutes it with a pair of arc nodes. 
The edges of type p, t, and a between Start, In, and arc nodes indicate the di- 
rections of the added arcs. The textual annotations indicate the attributes of the 
created elements and their values. A reference between a pair of @ indicates the 
value of an attribute of the corresponding ASGG production. Attribute absNode 
is used to set the correspondences between semantic and abstract nodes. 
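
A programmed production of this kind can be mimicked by a main function that records one pending p-edge per In place and a helper that replaces each p-edge with a pair of arcs. The sketch below is a simplified illustration: the net encoding, the function names, the place names, and the representation of the predicate as a list of conjuncts are assumptions, and the b edge is omitted.

```python
def add_start_transition(net, process, abs_node):
    """Main production: add a Start transition and one da p-edge per In place."""
    start = {
        "name": abs_node["name"],      # value of @2.name@ in the ASGG production
        "type": "Start",
        "conjuncts": [],               # predicate, kept as a list of conjuncts
        "action": abs_node["action"],  # value of @2.action@
        "tMin": "enab",
        "tMax": "enab + inf",
        "absNode": abs_node["name"],   # link back to the abstract node
    }
    net["transitions"].append(start)
    p_edges = [("da", start, place) for place in net["in_places"][process]]
    for p_edge in p_edges:             # every invoked subproduction must be applied
        replace_double_arc(net, p_edge)
    return start


def replace_double_arc(net, p_edge):
    """Subproduction: substitute one da p-edge with a pair of inverted arcs."""
    _, start, place = p_edge
    net["arcs"].append((place, start["name"]))  # arc from the In place to Start
    net["arcs"].append((start["name"], place))  # arc from Start back to the In place
    start["conjuncts"].append(f"{place} != NIL")


# Hypothetical semantic representation with two In places.
net = {"transitions": [], "arcs": [],
       "in_places": {"DispenseHotDrinks": ["milk", "coffee"]}}
start = add_start_transition(
    net, "DispenseHotDrinks", {"name": "DispenseHotDrinksIC", "action": "mix"})
print(" AND ".join(start["conjuncts"]))
```

The loop over p_edges plays the role of the subproduction invocations: the main production cannot know statically how many arcs are needed, so it only marks where they must be added.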

The application of the ASGG production to the P node of Figure 2(a) is 
shown in Figure 4(a). The abstract representation has a single IC node, which 
is linked to all I nodes. The effect of the SGG production on the graph of 
Figure 2(b) is shown in Figure 4(b). The semantic representation has a Start 
transition connected to all In places. Two arcs from/to In places are required 
to avoid accumulation of tokens: Any time a token is added to these places, the 
old one is overwritten. 





(a) Syntactic representation (b) Semantic representation 



Fig. 4. Input consumption for process Dispense Hot Drinks 







The rule illustrated in Figure 3 models the simultaneous consumption of all 
input values. The rules that correspond to other interpretations are discussed in 
Section 4. 

3.2 Consistency Framework 

The consistency framework defines a hierarchy of sets of customization rules. 
Each customization rule defines an interpretation for a feature; a set of rules 
defines a set of alternative interpretations for the same feature. The hierarchy 
indicates the scope of each rule, and the subsets of rules that can be selected for 
defining a particular interpretation. The modification to a rule does not affect 
rules higher in the hierarchy. A feasible interpretation is obtained by selecting 
at most one rule from each set. A path between each selected rule and the root 
must be guaranteed, that is, if no rule is selected from a set S of rules, no rule 
can be selected from the sets that are connected to set Axiom only through paths 
that include S. 
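
The feasibility condition can be read as a reachability check over the hierarchy of rule sets. The following sketch assumes that the hierarchy is given as a mapping from each set to its parent sets and that a selection maps each set to at most one chosen rule; the fragment of the hierarchy and the rule names are hypothetical and are not taken from Figure 5.

```python
def is_feasible(parents, selection, root="Axiom"):
    """Check that every selected set reaches the root through selected sets only.

    parents:   maps a rule set to the list of sets it directly depends on
    selection: maps a rule set to the single rule chosen from it (or None),
               so "at most one rule per set" holds by construction
    """
    selected = {s for s, rule in selection.items() if rule is not None}

    def reaches_root(rule_set, seen=()):
        if rule_set == root:
            return root in selected
        if rule_set not in selected or rule_set in seen:
            return False
        return any(reaches_root(p, seen + (rule_set,))
                   for p in parents.get(rule_set, []))

    return all(reaches_root(s) for s in selected)


# Hypothetical fragment of a hierarchy.
parents = {
    "Process": ["Axiom"],
    "Process Input": ["Process"],
    "Input Consumption": ["Process Input"],
}
ok = is_feasible(parents, {"Axiom": "new model", "Process": "add process",
                           "Process Input": "add input port"})
broken = is_feasible(parents, {"Axiom": "new model", "Process": "add process",
                               "Input Consumption": "all inputs"})
print(ok, broken)
```

In the second selection no rule is chosen from Process Input, so the rule chosen from Input Consumption has no path to the axiom and the selection is rejected.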

The consistency framework for SA is presented in Figure 5. Set Axiom of 
Figure 5 contains the rule that creates a new model by adding a marker that 
identifies the context diagram. Modifications to the axiom could affect all rules. 
Set Diagram contains the rules that add a new diagram marker and build the 
model hierarchy. Ignoring set Diagram does not compromise any formalization, 
but it leads to flat models (e.g., SA models à la De Marco [18]). 

Set Process contains the rules for adding processes in the model. They add 
a process marker (a node of type P referring to Figure 2). Sets Process Input 
and Process Output, which follow set Process, contain the rules that add input 
and output ports to the process marker thus defining the data interface of the 
process. Sets Input Consumption and Output Production contain rules that give 
semantics to the consumption or production of input or output values, respec- 
tively (one of the rules in set Input Consumption is illustrated in Figure 3). Set 
Process Execution contains the rules that combine input consumption and out- 
put production to give semantics to process execution. Ignoring one of these sets 
of rules would compromise the formalization. For example, not considering set 
Output Production would lead to processes without outputs and, thus, without 
execution semantics. 

The subgraph rooted in node Terminator contains the analogous rules for 
terminators. The hierarchies rooted in sets Flow and Store contain the rules 
for adding flows and stores. Such hierarchies depend on all interface rules of 
processes and terminators. 



4 Example Behaviors for Process Elements 

The informality of SA allows users to associate each element (node) identified in 
Figure 5 with several different interpretations. For space reasons, in this section 
we dwell only on process behaviors; interested readers can refer to [2] for a 
complete formalization of SA. 
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Fig. 5. Consistency Framework for Structured Analysis 



Recalling the alternative behaviors for process Dispense Hot Drinks of Sec- 
tion 2, we can characterize process semantics by means of: 

— The number of consumed inputs: Different interpretations consider processes 
that consume data from all, exactly one, any subsets of, or some subsets of 
their input flows; 

— The number of produced outputs: Different interpretations consider processes 
that produce data on all, exactly one, any subsets of, or some subsets of their 
output flows; 

— The execution duration: Different interpretations consider processes that ex- 
ecute instantaneously or not. 

The way processes manage unused input values depends on the number of 
input flows that are actually used by the process. If the process always reads data 
from all its input flows, it could use some data and simply remove the others. In 
contrast, if the process reads data from a subset of its input flows, it may preserve 
unused values for subsequent executions. 

The abstract representations of process interfaces are defined by rules at levels 
Main Element and Interface of Figure 5. Interfaces do not vary with respect 
to the different interpretations that can be associated with the consumption 
of input values, the production of output values, and the duration of process 
execution. Rules at level Internals of Figure 5 describe the different semantics. 
In this section we focus on the consumption of input values. In particular, we 
discuss the sets of rules for defining the interpretations of process Dispense Hot 
Drinks briefly introduced in Section 2. 

The behavior that always consumes all input values has already been presented 
in Figure 2; alternative interpretations are summarized in Figure 6. Besides 
reading from all input flows, we formalize also the possibilities of reading from 
single flows, from any combination of flows, and from particular combinations 
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(a) Consume data from one input flow at a time 





(b) Consume data from one of all possible subsets 





(c) Consume data from one of some subsets defined by the user 

Fig. 6. Several interpretations for consuming input values for process Dispense Hot 
Drinks 
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of flows. All abstract representations include an IC node for each possible way 
of consuming values from input flows. Each IC node is connected to all input 
ports that carry the data to be consumed. The semantic representations include 
a Start transition for each possible way of consuming values from input flows. 

The interpretation of Figure 6(a) describes a process that consumes exactly 
one input value at a time (process Dispense Hot Drinks would produce only 
milk or coffee). The abstract representation of Figure 6(a) has an IC node for 
each I node. The semantic representation has a Start transition for each In 
place. Each Start transition is connected to a different In place through a 
pair of inverted arcs. The interpretation of Figure 6(b) describes a process that 
can consume values for any non-empty subset of input flows (process Dispense 
Hot Drinks would produce milk, coffee, or cappuccino). The abstract 
representation comprises an IC element for each possible non-empty subset of 
process inputs. The semantic representation comprises a Start transition for 
each possible non-empty subset of In places. The interpretation of Figure 6(c) 
describes a process that can consume values for some non-empty subset of input 
flows as indicated by the user (process Dispense Hot Drinks would produce 
only milk or cappuccino, but never coffee alone). 
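
The interpretations differ only in which subsets of input flows receive an IC node (and a Start transition). A small enumeration makes the difference concrete; the function name, the mode strings, and the input names milk and coffee are assumptions made for this illustration.

```python
from itertools import combinations


def consumption_alternatives(inputs, mode, user_subsets=()):
    """Return the subsets of input flows that each get an IC node."""
    inputs = list(inputs)
    if mode == "all":
        return [frozenset(inputs)]
    if mode == "exactly one":
        return [frozenset([i]) for i in inputs]
    if mode == "all subsets":
        return [frozenset(c)
                for size in range(1, len(inputs) + 1)
                for c in combinations(inputs, size)]
    if mode == "some subsets":
        subsets = [frozenset(s) for s in user_subsets]
        if any(not s or not s <= set(inputs) for s in subsets):
            raise ValueError("user subsets must be non-empty subsets of the inputs")
        return subsets
    raise ValueError(f"unknown interpretation: {mode}")


flows = ["milk", "coffee"]
for mode in ("all", "exactly one", "all subsets"):
    print(mode, [sorted(s) for s in consumption_alternatives(flows, mode)])
print("some subsets", [sorted(s) for s in consumption_alternatives(
    flows, "some subsets", user_subsets=[{"milk"}, {"milk", "coffee"}])])
```

With n input flows, the all-subsets interpretation yields 2^n - 1 alternatives, which is why it is expressed with subproductions rather than with a fixed-size rule.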

Rule Add Input Consumption - exactly one, which implements the solution 
of Figure 6(a) (from one input at a time), is given in Figure 7. The ASGG 
production adds an IC node for each input port. The main production adds an 
all inputs (ai) p-edge for each input port. The added ai p-edges invoke the 
subproduction to add as many IC nodes as required by the current number of 
input ports. The SGG productions similarly add a set of Start transitions and 
connect each of them to the corresponding In place through a pair of inverted 
arcs. The pairs of inverted arcs are added with an additional subproduction 
analogous to the SGG subproduction shown in Figure 3. 

Rule Add Input Consumption - all subsets, which implements the solution of 
Figure 6(b) (from one of all possible subsets), is given in Figure 8. Both the 
ASGG and the SGG production comprise a main production and some subpro- 
ductions. The two main productions are the same as the ones for the former 
interpretation and are illustrated in Figure 7. The subproductions are given in 
Figure 8. The first ASGG subproduction is invoked by the ai sp-edges added 
by the main production. It adds an IC node between each pair of P and I nodes 
identified by the ai sp-edge. It also adds an other input consumptions (oic) 
sp-edge between the selected I node and all other input consumptions (the IC 
node in the embedding). The second ASGG subproduction is invoked by the oic 
sp-edges added by the first subproduction. It adds a new IC node for each pair 
of I and IC nodes identified by the oic sp-edge. It connects the added IC node 
to the selected I node and to all the inputs of the target IC node. Analogously, 
the SGG subproductions add a Start transition for each subset of In places, 
and suitably connect them to the related In places using pairs of inverted arcs. 
The pairs of inverted arcs are added by means of a third subproduction that is 
analogous to the subproduction already shown in Figure 3. 
[Figure 7: attribute annotations recovered from the figure]

(a) ASGG production
    3.name = 1.name * 2.name * "IC";
    3.type = "InCon";
    3.action = ?;

(b) SGG production
    3.name = @3.name@;
    3.type = "Start";
    3.predicate = [value not legible];
    3.action = @3.action@;
    3.tMin = "enab";
    3.tMax = "enab + ∞";
    3.absNode = @3.name@;

Fig. 7. Rule Add Input Consumption - exactly one: The process consumes a value from 
exactly one input flow 
[Figure 8: attribute annotations recovered from the figure]

(a) ASGG production
    3.name = 1.name * 2.name * 1.cnt;
    3.type = "InCons";
    3.action = ?;
    1.cnt += 1;

    4.name = 1.name * 2.name * 1.cnt;
    4.type = "InCons";
    4.action = ?;
    1.cnt += 1;

(b) SGG production
    3.name = @3.name@;
    3.type = "Start";
    3.predicate = 2.name * "!= NIL";
    3.action = @3.action@;
    3.tMin = "enab";
    3.tMax = "enab + ∞";
    3.absNode = @3.name@;

    4.name = @4.name@;
    4.type = "Start";
    4.predicate = 2.name * "!= NIL";
    4.action = @4.action@;
    4.tMin = "enab";
    4.tMax = "enab + ∞";
    4.absNode = @4.name@;

Fig. 8. Rule Add Input Consumption - all subsets: The process can consume values from 
any combination of input flows 
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Rule Add Input Consumption - some subsets, which implements the solution of 
Figure 6(c) (from one of some user-defined subsets), is given in Figure 9. The rule 
adds an IC node (ASGG production) and a Start transition (SGG production) 
for each identified subset of inputs. A straightforward rule - not presented here 
- connects the added IC nodes (Start transitions) to the corresponding I nodes 
(In places). 





[Figure 9: attribute annotations recovered from the figure]

(a) ASGG production
    2.name = 1.name * 1.cnt;
    2.type = "InCons";
    2.action = ?;
    1.cnt += 1;

(b) SGG production
    2.name = @2.name@;
    2.type = "Start";
    2.predicate = [value not legible];
    2.action = @2.action@;
    2.tMin = "enab";
    2.tMax = "enab + ∞";
    2.absNode = @2.name@;

Fig. 9. Rule Add Input Consumption - some subsets: The process can consume values 
from some user-defined combinations of input flows 



The formalization of a process is completed by defining how output values 
are produced and by indicating how input consumption and output production 
define process executions. The interpretations and corresponding rules for output 
production are analogous to the interpretations and rules for input consumption 
discussed above. Process executions require that IC and output production 
(OP) nodes be properly connected. At the semantic level, the corresponding Start 
and End transitions must be connected as well. The simplest semantics consists 
in assuming instantaneous atomic execution for processes. This interpretation 
can be implemented by simply collapsing all pairs of IC and OP nodes to obtain 
Execution (EX) nodes, and pairs of Start and End transitions to obtain Exec 
transitions. 

Some interpretations assume process execution to be interruptible. This se- 
mantics can be implemented by introducing an internal state between the be- 
ginning and the end of a process execution. The new state can be modeled using 
two places Idle and Exec that indicate the idle and execution states, respec- 
tively. An event that interrupts the execution can be modeled using a transition 
that moves the token from place Exec to place Idle. Interruptible and 
non-interruptible semantics can be extended with the duration of process execution by 
simply modifying the timing functions of Exec or Start and End transitions. 
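
The interruptible interpretation can be illustrated with a two-place net. The sketch below is a toy model, not the HLTPN executor mentioned in the conclusions: it assumes that the token starts in Idle, that Start moves it to Exec, and that End moves it back; only the interrupt transition from Exec to Idle is stated explicitly in the text. Timing functions are left out.

```python
class InterruptibleProcess:
    """Two places, Idle and Exec, with a single token moving between them."""

    def __init__(self):
        self.marking = {"Idle": 1, "Exec": 0}  # assumed initial marking

    def _fire(self, source, target):
        """Move one token from source to target if the transition is enabled."""
        if self.marking[source] == 0:
            return False
        self.marking[source] -= 1
        self.marking[target] += 1
        return True

    def start(self):       # assumed: begin an execution
        return self._fire("Idle", "Exec")

    def end(self):         # assumed: regular completion
        return self._fire("Exec", "Idle")

    def interrupt(self):   # from the text: token moves from Exec to Idle
        return self._fire("Exec", "Idle")


process = InterruptibleProcess()
print(process.interrupt())  # not enabled while the process is idle
print(process.start(), process.marking)
print(process.interrupt(), process.marking)
```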



5 Conclusions 

This paper presents an approach for formally defining Structured Analysis as a 
notation family, that is, as an informal specification notation that can be given 
many different interpretations. The approach is based on customization rules, 
which are given as pairs of attributed programmable graph grammars, and a con- 
sistency framework, which constrains the rules in a hierarchical framework. The 
rules allow the formalization of different interpretations for the same notation 
element. The hierarchical framework indicates complementary rules and allows 
the substitution of subsets of rules depending on the chosen interpretation. The 
presented formalization approach is supported by a prototype environment that 
supplies users with a general-purpose rule interpreter and an executor/analyzer 
for HLTPNs. The environment can easily be integrated with special-purpose 
editors or CASE tools to fully support a specific notation. The proposed for- 
malizations of Structured Analysis have been validated by designing all required 
customization rules ([2]) and integrating the interpreter with two commercial 
CASE tools: StP [16] and ObjectMaker [19]. The approach has been successfully 
applied to other notation families from different domains: an extension to 
Structured Analysis for design specifications [1]; Lemma [3], a new notation for 
specifying medical diagnostic processes; and FBD (Function Block Diagram [6]), 
a graphical notation for designing programmable controllers. We are currently 
experimenting with applying the approach to UML ([5]) and other similar 
object-oriented notations. 
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Abstract. Diagrams that serve as a visual input facility for program- 
ming environments have to be translated into some kind of semantic 
description. This paper describes such a method which is based on a 
specification of the translation process. The translation process starts 
with a diagram, which is simply represented as a collection of atomic dia- 
gram components, and it ends up with some data structure as a semantic 
representation of the diagram. The specification of the translation pro- 
cess mainly consists of two parts: the specification of spatial relationships 
between atomic diagram components in terms of their numeric param- 
eters (e.g., position, size), and an attributed hypergraph grammar that 
describes the concrete diagram syntax as well as the rules for generating 
the semantic representation. 



1 Introduction 

Diagram languages which are used for visual programming are formal languages 
and are thus defined by their syntax, semantics, and pragmatics. Syntax describes 
the atomic components of the language and the rules for how they can be 
arranged to make up valid sentences. Semantics describe the meaning of diagrams, 
i.e., the behavior of a computer when such diagrams are “executed”, and 
pragmatics consist of the context where sentences of this language are used. One issue 
of pragmatics is to “draw” diagrams using a specific graphical editor and then to 
translate these “drawings” into a representation which is appropriate for some 
kind of compiler, interpreter, or virtual machine¹, e.g., diagrams that represent 
visual programs are first translated into an equivalent textual program which is 
then translated by a common compiler into machine code. This task requires a 
graphical editor that “understands” sentences of the specific language which it 
is designed for. Otherwise it is merely a drawing tool. “Understanding” means 
that the editor has to be able to check the drawings’ syntax and to transform 
(“translate”) diagrams into a representation which is required by the compiler. 

This paper describes a grammar-based method for such a syntax check and 
translation process: it starts with a diagram (e.g., created with a graphical 
editor), that consists of a spatial arrangement of atomic components, and ends 

¹ For brevity, we will use the term “compiler” in the rest of this paper as a 
representative of all possible further processing steps. 

M. Nagl, A. Schürr, and M. Münch (Eds.): AGTIVE’99, LNCS 1779, pp. 209-224, 2000. 

© Springer-Verlag Berlin Heidelberg 2000 
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up with a semantic description, e.g., a program text for a common compiler 
for textual languages, but it is not restricted to strings. The syntax check and 
translation process for a concrete diagram language is determined by two speci- 
fications: 

1. The scanning procedure constructs a hypergraph model for the initial dia- 
gram. It is controlled by a spatial relationship specification which describes 
meaningful spatial relationships between diagram components. 

2. An attributed hypergraph grammar specifies the syntax of these hypergraph 
models and, as a consequence, of the diagram language. The grammar furthermore 
describes the relationship between hypergraph models and their semantic 
descriptions. Based on this grammar, a parser checks the diagram 
syntax and translates the diagram into its semantic representation. 

This method is well suited to diagram languages with (hyper)graphs as an 
appropriate means of diagram representation and (hyper)graph grammars as 
syntax definition. Using graphs (and therefore hypergraphs as a generalized 
form of graphs; see Section 4) as an intermediate representation does 
not impose a strong restriction on the class of diagram languages which can be 
processed by this method, since graphs can be used as an abstract representation 
for a wide variety of visual languages [7]. 

The rest of this paper is structured as follows: The next section introduces 
Ladder Diagram, a widely used programming language for Programmable Logic 
Controllers (PLCs), which is used as a running example in this paper. Section 3 
summarizes related work, and Section 4 briefly introduces graphs, hypergraphs, 
and hypergraph grammars. The translation process is described in Section 5. 
Section 6 gives a brief survey of DiaGen, a framework for creating graphical 
editors, where the approach of this paper is incorporated to generate front-ends 
for common execution environments. Section 7 concludes. 

2 Example: Ladder Diagram 

Throughout this paper, we will use ladder diagrams as a running example. Ladder 
Diagram is a visual programming language for Programmable Logic Controllers 
(PLCs) which has become part of the IEC 1131 standard [1]. Ladder Diagram 
has been derived from schematic diagrams of relay controls where each relay 
is energized by a network of switches, either input or relay switches. Relays 
have been replaced by boolean values in PLCs, and networks of switches have 
been replaced by boolean expressions. However, ladder diagrams still allow a 
PLC to be programmed like a relay control: boolean values which are defined by 
boolean expressions are drawn as relay coils, boolean input values are drawn as switches. 
The boolean complement of a value is drawn as a normally closed switch. Ladder 
diagrams allow networks to be built from switches that are connected in 
series (boolean and-operation) and in parallel (boolean or-operation). 
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Fig. 1. A sample ladder diagram with an additional function block of type TB. 



Figure 1 shows a sample ladder diagram² which controls three vents. An 
LED shall indicate the states of the vents. A continuously lit LED indicates at least 
two working vents. The LED blinks if at least two vents fail. An additional 
alarm is triggered if all three vents fail. Figure 1 shows the boolean values and 
their controlling boolean expressions. For example, the top-most sub-diagram shows that 
Two_vents is defined as the result of a parallel connection where each branch 
consists of a series connection of two switches. An equivalent boolean expression 
for this sub-diagram is Two_vents := (V1 ∧ V2) ∨ (V1 ∧ V3) ∨ (V2 ∧ V3). The third 
sub-diagram, which defines No_vent, uses normally closed switches which represent 
boolean complements of the represented boolean values. Figure 1 actually uses 

² This example is taken from [16]. 
an extension of ladder diagrams: As allowed in IEC 1131, ladder diagrams can be 
combined with function blocks which offer some complex functionality as black 
boxes with a certain number of inputs and outputs. The inputs may be boolean 
values, numeric values or any other type of value which is supported by the 
respective PLC. The fourth sub-diagram of our example makes use of a function 
block Timer of type TB which implements an oscillator. The inputs Thigh and 
Tlow control the up and down time of the oscillation. The Enable input allows 
the oscillator to be switched on and off. The ladder diagram specifies that the LED is 
lit continuously if Two_vents is true, or blinking if Two_vents is false and 
No_vents or One_vent is true. 
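
The boolean logic of the example can be checked with a few lines of Python. The expression for Two_vents is the one given in the text; the definitions used here for the one-vent and no-vent signals (exactly one, respectively none, of the vents working) are assumptions, since the text does not spell them out, and the function names are hypothetical.

```python
from itertools import product


def two_vents(v1, v2, v3):
    # Parallel connection of three branches, each a series of two switches.
    return (v1 and v2) or (v1 and v3) or (v2 and v3)


def one_vent(v1, v2, v3):
    # Assumption: exactly one vent is working.
    return sum([v1, v2, v3]) == 1


def no_vent(v1, v2, v3):
    # Normally closed switches: the complements of all three inputs in series.
    return (not v1) and (not v2) and (not v3)


def led_state(v1, v2, v3):
    if two_vents(v1, v2, v3):
        return "lit"
    if no_vent(v1, v2, v3) or one_vent(v1, v2, v3):
        return "blinking"
    return "off"


for vents in product([False, True], repeat=3):
    print(vents, led_state(*vents), "alarm" if no_vent(*vents) else "")
```

Under these assumptions the LED is never off: whenever fewer than two vents work, either one or none of them does, so the blinking condition covers every remaining case.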

This paper defines the visual syntax of ladder diagrams with embedded function 
blocks. Boolean values may be combined by and-operations (series connection) 
and or-operations (parallel connection). Networks are drawn from left to 
right. They always start at the left vertical border-line or at a function block 
output. Networks end either at a coil (boolean output) or a function block input. 
Fig. 2. The semantics of the ladder diagram of Fig. 1 written as Instruction List. 



Ladder diagram semantics are defined by the behavior of the PLC that is 
programmed by a specific ladder diagram. In this paper, we will use Instruction List 
(IL), a textual, assembler-like language for PLCs which is also part of IEC 1131. 
The machine model behind IL is an accumulator machine with an additional stack 
and one-address commands. Possible commands are “LD”, “AND”, and “OR”, which 
take a boolean variable as operand. Modifiers to these commands are “N”, which 
negates the operand’s value, and “(”, which starts a new sub-expression and 
pushes the old value onto the stack. The command “)” finishes the sub-expression 
and combines its value with the top of the stack. Function blocks are called 
by the “CAL” command which also specifies the inputs. Function block outputs 
are referred to by a dotted name like Timer.Q in our example. IL furthermore 
contains many other commands, but these are not relevant for this paper. 
Figure 2 shows the IL program that describes the semantics of the ladder diagram 
of Fig. 1. 
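
The accumulator-and-stack model described above can be sketched as a small interpreter. This is a simplified illustration rather than an implementation of the standard: the encoding of a program as (command, operand) pairs, the function name run_il, and the convention that a command with the “(” modifier loads its operand as the first value of the new sub-expression are assumptions; function block calls are not covered, and the sample program is written for this sketch and is not the program of Fig. 2.

```python
BASE_COMMANDS = ("LD", "AND", "OR")
OPERATIONS = {"AND": lambda a, b: a and b, "OR": lambda a, b: a or b}


def run_il(program, variables):
    """Evaluate a list of (command, operand) pairs; return the accumulator."""
    accumulator = False
    stack = []
    for command, operand in program:
        if command == ")":
            # Finish the sub-expression and combine it with the saved value.
            name, saved = stack.pop()
            accumulator = OPERATIONS[name](saved, accumulator)
            continue
        deferred = command.endswith("(")
        name = command.rstrip("(")
        negated = name.endswith("N") and name[:-1] in BASE_COMMANDS
        if negated:
            name = name[:-1]
        if name not in BASE_COMMANDS or (deferred and (negated or name == "LD")):
            raise ValueError(f"unsupported command: {command}")
        value = variables[operand]
        if negated:
            value = not value
        if deferred:
            # Push the old value and start a new sub-expression.
            stack.append((name, accumulator))
            accumulator = value
        elif name == "LD":
            accumulator = value
        else:
            accumulator = OPERATIONS[name](accumulator, value)
    return accumulator


# Two_vents := (V1 AND V2) OR (V1 AND V3) OR (V2 AND V3)
TWO_VENTS = [
    ("LD", "V1"), ("AND", "V2"),
    ("OR(", "V1"), ("AND", "V3"), (")", None),
    ("OR(", "V2"), ("AND", "V3"), (")", None),
]
print(run_il(TWO_VENTS, {"V1": True, "V2": False, "V3": True}))
```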



Creating Semantic Representations of Diagrams 213 



3 Related Work 

Many authors have described semantics of visual languages. In most cases 
(e.g., [9]), however, they restrict themselves to specific visual languages. Others take an 
algebraic view of modeling picture semantics [25]. Work that is most closely 
related to this paper is Erwig’s definition of visual language semantics using 
abstract syntax graphs [7] and the separation of concrete and abstract syntax 
proposed in [2,17]. 

Erwig uses abstract syntax graphs that abstract from representation details 
of concrete diagrams. He does not restrict semantic definition to this representa- 
tion, but uses different schemes, e.g., denotational semantics, to define diagram 
semantics based on abstract syntax. However, he does not offer a method for 
translating a concrete diagram into its abstract syntax representation. 

Rekers et al. have proposed to use spatial relationship graphs (SRGs) to 
represent a diagram’s concrete syntax and an abstract syntax graph (ASG) for 
its abstract syntax [2,17]. The syntax of each of the graphs is represented by a 
graph grammar. By coupling both grammars, they are able to translate SRGs 
into ASGs and vice versa. The correspondence between ASG and SRG is represented 
by special edges connecting ASG nodes with corresponding SRG nodes. 
The approach which is described in this paper uses hypergraphs and hypergraph 
grammars to describe concrete syntax and arbitrary data structures for seman- 
tic representations. Hypergraphs seem to offer a more natural representation 
of diagram components that have different “attachment areas” which link to 
other diagram components (consider connection points of a transistor symbol in 
schematic diagrams of electric circuits). Moreover, there are restricted, yet powerful 
types of hypergraph grammars that allow for efficient parsing [3,12] which 
are not available for plain graph grammars. Arbitrary data structures for seman- 
tic representations, e.g., strings, that are used in this paper, have the advantage 
that they can be customized for common compilers. 

VLCC (Visual Language Compiler-Compiler) [6] is a tool whose approach is 
related to the approach in this paper. VLCC creates parsers for visual languages. 
As with textual parsers (e.g., those generated by yacc), semantic actions are 
used to create semantic representations of visual sentences. VLCC depends on 
positional grammars which basically extend string grammars by 2D-relationships. 
Its parser is based on LR-parsers used for context-free string grammars, which is 
rather restricted compared to the hypergraph parsers which are used in this paper’s 
approach. 

4 Hypergraphs and Grammars 

Before the translation process is described in the next section, we will briefly 
introduce the notion of graphs, hypergraphs, and hypergraph grammars as used 
in this paper. 

Each (directed) graph consists of a set of labeled nodes and a set of labeled 
(directed) edges. Each edge visits two nodes which need not be different. Hyper- 
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graphs are generalizations of directed graphs: they have a set of labeled hyper- 
edges instead of edges. Each hyperedge has a fixed number of labeled tentacles 
which is determined by the hyperedge’s label. Tentacles connect the hyperedge 
with nodes visited by the hyperedge. A regular directed graph is a hypergraph 
where each hyperedge has two tentacles with labels source and target. Nodes 
will be represented by black dots, (directed) edges by arrows, and hyperedges 
by boxes containing the hyperedge label. Thin lines or arrows are used to repre- 
sent tentacles connecting the hyperedge with visited nodes. Tentacle labels are 
omitted where possible. 
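
As an illustration of these definitions, a hypergraph can be represented by a signature that fixes the tentacle labels for every hyperedge label, plus a list of hyperedges. The class and field names below are assumptions made for this sketch, node labels are omitted for brevity, and the transistor label is a hypothetical example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hyperedge:
    label: str
    tentacles: tuple  # pairs (tentacle label, visited node)


@dataclass
class Hypergraph:
    signature: dict                       # hyperedge label -> tentacle labels
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_edge(self, label, **visited):
        """Add a hyperedge; its tentacles are fixed by the hyperedge label."""
        expected = self.signature[label]
        if set(visited) != set(expected):
            raise ValueError(f"{label} needs exactly the tentacles {expected}")
        self.nodes.update(visited.values())
        edge = Hyperedge(label, tuple((t, visited[t]) for t in expected))
        self.edges.append(edge)
        return edge


hypergraph = Hypergraph(signature={
    "edge": ("source", "target"),                      # ordinary directed edge
    "transistor": ("base", "collector", "emitter"),    # hypothetical 3-tentacle label
})
plain = hypergraph.add_edge("edge", source="n1", target="n2")
part = hypergraph.add_edge("transistor", base="n2", collector="n3", emitter="n1")
print(plain)
print(sorted(hypergraph.nodes))
```

A directed graph is the special case in which every label has the two tentacles source and target, as stated in the text.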

Hypergraph grammars are similar to string grammars. Each hypergraph 
grammar consists of two sets of terminal and nonterminal hyperedge labels 
and a starting hypergraph which contains nonterminally labeled hyperedges only. 
Syntax is described by a set of productions of the form L ::= R with L (left- 
hand side, LHS) and R (right-hand side, RHS) being hypergraphs. A production 
L ::= R is applied to a (host) hypergraph H by finding L as a subgraph of H 
and replacing this match by R, obtaining hypergraph H'. We say H' is derived 
from H (written H ⇒ H') in one step. The grammar’s language is then defined 
by the set of terminally labeled hypergraphs which can be derived from 
the starting hypergraph in a finite number of steps. 

There are different types of hypergraph grammars which impose restrictions 
on a production’s LHS and RHS as well as the allowed sequence of deriva- 
tion steps. Context-free hypergraph grammars are the simplest ones: each LHS 
has to consist of a single nonterminally labeled hyperedge together with the 
appropriate number of nodes. Application of such a production removes the 
LHS hyperedge and replaces it by the RHS. Matching node labels of LHS and 
RHS determine how the RHS has to fit in after removing the LHS hyperedge. 
Productions P1 ... P24 of Fig. 7 are context-free ones. Context-free hypergraph 
grammars with embeddings are more expressive than context-free ones. They 
additionally allow embedding productions which consist of the same LHS and RHS, 
but with an additional (“embedded”) (sub-)hypergraph on the RHS, i.e., this 
hypergraph is embedded into the context provided by the LHS when applying 
such a production (production P25 of Fig. 7; the gray edges represent the 
embedding context). Parsing algorithms and a more detailed description of both 
grammar types can be found in [12,3]. 
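
A context-free derivation step can be sketched as the replacement of one nonterminal hyperedge by the edges of a right-hand side, with the nodes of the left-hand side bound to the nodes visited by the replaced hyperedge. The encoding of hyperedges as (label, nodes) pairs, the function name, and the two productions for a hypothetical nonterminal Net are assumptions made for illustration; they are not the productions of Fig. 7.

```python
from itertools import count

_fresh_ids = count()  # source of names for nodes created by a production


def apply_production(host, index, production):
    """Replace the hyperedge host[index] according to a context-free production.

    production = (LHS label, LHS node names, RHS hyperedges)
    """
    lhs_label, lhs_nodes, rhs_edges = production
    label, attached = host[index]
    if label != lhs_label or len(attached) != len(lhs_nodes):
        raise ValueError("production does not match the selected hyperedge")
    binding = dict(zip(lhs_nodes, attached))  # matching LHS and RHS nodes

    def image(node):
        if node not in binding:               # node occurs only on the RHS
            binding[node] = f"{node}#{next(_fresh_ids)}"
        return binding[node]

    new_edges = [(lbl, tuple(image(n) for n in nodes)) for lbl, nodes in rhs_edges]
    return host[:index] + new_edges + host[index + 1:]


# Hypothetical productions: a Net is a switch followed by a Net, or one switch.
EXTEND = ("Net", ("a", "b"), [("switch", ("a", "m")), ("Net", ("m", "b"))])
FINISH = ("Net", ("a", "b"), [("switch", ("a", "b"))])

step1 = apply_production([("Net", ("x", "y"))], 0, EXTEND)
step2 = apply_production(step1, 1, FINISH)
print(step2)
```

After the second step only terminally labeled hyperedges remain, so the resulting hypergraph belongs to the language of this small grammar.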

In the following, we will use (hyper)graphs as diagram representations (as 
spatial relationship hypergraph SRHG and hypergraph model HGM). These 
graphs can be extended by geometric attributes, e.g., representing exact posi- 
tions in the plane. This additional information is omitted here, but it is clear 
that using graphs does not impose any loss of information. Using graphs has the 
advantage (e.g., compared to relational structures) that they explicitly repre- 
sent items and relationships between items which makes this information read- 
ily available. Eurthermore, graphs offer a wide variety of graph algorithms for 
further processing and graph grammars for defining graph classes and their struc- 
ture. However, using graphs also has the disadvantage that making relationships 
explicit can lead to rather big representations. 
5 The Translation Process 

Fig. 3 shows the three steps of the translation process and the resulting hy- 
pergraphs with increasing abstraction level. These steps are described in the 
following. 




[Figure 3 labels: spatial relationship hypergraph, hypergraph model, semantic value]



Fig. 3. Translating a diagram into a semantic representation. 



5.1 Scanning 

A diagram consists of a set of diagram components (transistor and resistor sym- 
bols etc. for schematic diagrams of electronic circuits, switches, coils, function 
blocks, and lines in ladder diagrams) with spatial relationships between them. 
In general, each component has a certain number of attachment areas which are 
somehow linked to attachment areas of other components. How these
areas may be linked depends on the types of the related components. For schematic
diagrams of electronic circuits, each symbol has its connectors as attachment 
areas. Actually each connector can be linked to any other connector. Compo- 
nents of ladder diagrams have the following attachment areas: switches and coils 
have their left and right contacts as their attachment areas. Lines can manifest 
spatial relationships at their end points as well as along the line itself; lines thus have
these three attachment areas. Function blocks have as many attachment areas 
as they have inputs and outputs. Finally, the left and right lines of a ladder 
diagram are considered as a special chassis component with the lines as its at- 
tachment areas. However, only some relationships between different attachment 
areas make sense. E.g., a direct relationship between a function block output and
a coil contact does not make sense in ladder diagrams.

A spatial relationship hypergraph (SRHG) is used to explicitly represent com- 
ponents and their relationships: Each component together with its attachment 
areas is represented by a hyperedge and some nodes that are visited by the 
hyperedge through its tentacles, which thus identify the attachment areas. Spa- 
tial relationships are represented by hyperedges (in general regular edges), too. 
Nodes are connected by such edges if the corresponding attachment areas are 
appropriately linked. 

For ladder diagrams, we have component hyperedges for the chassis consisting
of the left and right vertical line (type “chassis”, 2 tentacles “Left” and
“Right”) and vertical and horizontal lines (type “vline” and “hline” with 2
tentacles “LineEnd” and 1 tentacle “Line”). Furthermore, we have normally open
as well as normally closed switches and coils (type “openContact”, “closedContact”,
and “coil”, 2 tentacles “Contact”). In this paper, we restrict the set of
possible function blocks to TB function blocks (see Fig. 1) with three inputs and
one output (type “TB”, 3 tentacles “Input”, 1 tentacle “Output”). Finally, we
have textual values like T#300ms in Fig. 1 (type “text”, 1 tentacle “Text”).

Possible relationships between such components are: lines can be related to
each other, e.g., by connecting at their end points, by a line’s end point meeting
another line, or by intersecting (LineEnd “conn” LineEnd, LineEnd “conn” Line, or Line “conn” Line).
Lines can be related to the left and right line of the chassis (LineEnd “conn” Left, 
LineEnd “conn” Right), to switches and coils (LineEnd “conn” Contact), to texts 
(LineEnd “conn” Text) as well as function block inputs (LineEnd “input” Input) 
and outputs (Output “output” LineEnd). Fig. 4 shows the resulting SRHG for 
the ladder diagram of Fig. 1 that defines LED. “Conn”, “input”, and “output” 
edges are drawn as gray arrows, the other edges are depicted as rectangles with 
arrows as their tentacles that connect to black dots as nodes. 




Fig. 4. The spatial relationship hypergraph of the ladder sub-diagram of Fig. 1 
that defines LED. 



The SRHG is the result of the scanning step and has to be created by a 
scanning procedure, which has to be provided with a specification of the kinds 
of diagram components, attachment areas, and how attachment areas can be 
related. Relationships between attachment areas are constrained by conditions 
on parameters of the attachment areas. Table 1 shows these constraints for ladder
diagrams: the table assigns a constraint to each pair n1, n2 of nodes and
each relation between the corresponding attachment areas. The constraint
imposes a condition on the attachment areas’ parameters. Accessible parameters
are p1 and p2 for the positions of n1 resp. n2 (e.g., line end points) as well as
y-coordinates y1 and y2. The constraints of Table 1 furthermore make use of Java-like
methods intersects and area, which check whether a line intersects another line
or a rectangular box resp. compute a box of a certain size around a point.
Table 1. Spatial relationships for ladder diagrams 



The scanning procedure essentially works as follows: 

1. For each diagram component, create an appropriate hyperedge together with 
its visited nodes, which are labeled according to the component’s attachment
areas. 

2. Check for any pair of nodes* and any possible relationship type between
those nodes whether the nodes’ parameters satisfy the constraints for this 
relation. If the constraint is satisfied, add a corresponding relationship edge. 

Checking each pair of nodes is quite inefficient (O(n²), where n is the number
of nodes). Attachment areas which do not intersect in the plane are generally
not related. A more efficient solution is to consider the rectangular bounding
box of the attachment area of each node and to check only those pairs of nodes
with intersecting bounding boxes. The complexity of this search is O(n log n + k),
where k is the number of intersections [11].
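
The second step with bounding-box pruning might look roughly as follows. This is a simplified sort-and-sweep sketch, not the algorithm of [11]: it prunes pairs along the x-axis only and keeps the active set in a plain list, so its worst case is still quadratic (reaching O(n log n + k) requires a suitable search structure for the active set). The names Node, candidate_pairs, and scan, and the form of the constraint table, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str   # e.g. "hline1.LineEnd"
    kind: str   # attachment-area label, e.g. "LineEnd" or "Contact"
    box: tuple  # bounding box (xmin, ymin, xmax, ymax)


def boxes_intersect(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidate_pairs(nodes):
    """Sweep along x and yield only node pairs with intersecting boxes."""
    active = []
    for n in sorted(nodes, key=lambda node: node.box[0]):
        # Drop nodes whose box ends left of the current box's start.
        active = [m for m in active if m.box[2] >= n.box[0]]
        for m in active:
            if boxes_intersect(m.box, n.box):
                yield m, n
        active.append(n)


def scan(nodes, constraints):
    """constraints maps (kind1, relation, kind2) to a predicate on two nodes."""
    edges = []
    for m, n in candidate_pairs(nodes):
        # Same-kind pairs are tested once, mixed pairs in both orders.
        orders = [(m, n)] if m.kind == n.kind else [(m, n), (n, m)]
        for a, b in orders:
            for (k1, rel, k2), satisfied in constraints.items():
                if (a.kind, b.kind) == (k1, k2) and satisfied(a, b):
                    edges.append((a.name, rel, b.name))
    return edges


nodes = [
    Node("hline1.LineEnd", "LineEnd", (0, 0, 2, 2)),
    Node("contact1.Contact", "Contact", (1, 1, 3, 3)),
    Node("hline2.LineEnd", "LineEnd", (10, 10, 11, 11)),
]
# Placeholder constraint; real constraints would test the areas' parameters.
constraints = {("LineEnd", "conn", "Contact"): lambda a, b: True}
print(scan(nodes, constraints))
```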

5.2 Reducing 

The SRHG which has been produced by the scanning step can now be used for 
syntax analysis. However, the situation is similar to that of compilers for textual
languages: the parser does not operate on the stream of characters directly. For 
efficiency reasons, this stream is preprocessed by the lexical analysis that removes 
unnecessary characters (e.g., comments) and combines elementary character se- 
quences to larger components (e.g., keywords). The same holds for the SRHG. 
Many spatial relationship edges are necessary to represent simple concepts. E.g., 
intersecting and connected lines in ladder diagrams represent the same boolean 
value of a boolean expression (or electric potential in terms of relay controls).

* We consider binary relationships in this paper. Since hyperedges are allowed as
relationship edges, arbitrary n-ary relationships could be considered, too. 
Therefore, it makes sense to reduce all connected lines with their representing
nodes to a single node. In order to make parsing — the following step — more ef- 
ficient, we take a reducing preprocessing step that creates a hypergraph model 
(HGM) from the SRHG by representing “essential” SRHG-subgraphs by more 
abstract hyperedges and/or combined (“unified”) nodes. Each SRHG node and 
hyperedge that is not a member of one of these “essential” subgraphs will not
become a member of the HGM.

The essential diagram situations in ladder diagrams consist of single SRHG 
hyperedges only. Figure 5 shows these situations together with their SRHG rep- 
resentation and how these situations are represented in the HGM. Only “hline”, 
“vline”, and “conn” edges of the SRHG are really reduced; all the other edges 
are simply translated into equivalent HGM ones. For other diagram types (e.g., 
visual λ-calculus VEX [5]), much more complicated reducing steps have to take
into account subgraphs which consist of several SRHG edges and negative ap- 
plication conditions, i.e., context that must not occur when taking a specific 
reduction action. 

Fig. 6 shows the result of reducing the SRHG by the rules depicted in Fig. 5. 
Please note how much the SRHG (Fig. 4) has been reduced. 
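
The core of this reduction, merging all nodes that are linked by lines and “conn” edges into a single node, can be sketched with a union-find structure. The sketch is an assumption-laden simplification: it does not reproduce the rules of Fig. 5, it treats “input” and “output” edges like any other kept edge only if they are passed in as hyperedges, and the function name reduce_srhg and the data layout are hypothetical.

```python
def reduce_srhg(hyperedges, conn_edges):
    """hyperedges: list of (type, [nodes]); conn_edges: list of (node, node).

    Returns the remaining hyperedges with every node replaced by the
    representative of its group of connected nodes.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    for a, b in conn_edges:
        union(a, b)

    kept = []
    for etype, nodes in hyperedges:
        if etype in ("hline", "vline"):
            # A line carries one boolean value: unify all its attachment nodes.
            for n in nodes[1:]:
                union(nodes[0], n)
        else:
            kept.append((etype, nodes))

    return [(etype, [find(n) for n in nodes]) for etype, nodes in kept]


srhg = [
    ("openContact", ["c1", "c2"]),
    ("hline", ["h1", "h2", "h"]),
    ("coil", ["k1", "k2"]),
]
conn = [("c2", "h1"), ("h2", "k1")]
print(reduce_srhg(srhg, conn))
```

In the printed result, the right node of the contact and the left node of the coil are the same node, and the line hyperedge has disappeared.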



5.3 Parsing 

The HGM which has been produced by first scanning the diagram and then 
reducing the obtained SRHG describes the diagram’s concrete structure. Syntax 
analysis of the diagram can thus be performed on the HGM. This step of the 
translation process checks the HGM according to a specified hypergraph gram- 
mar. As usual, syntax checking is performed by parsing, i.e., searching for a 
derivation sequence from the starting hypergraph to the HGM using grammar 
productions. For a survey of parsers which may be used in the context of visual 
languages, see [12,3]. 

In addition to syntax checking, the parser has to create a semantic description
in the process of constructing a derivation sequence. The situation is similar
to compilers for textual languages where nonterminal symbols and productions 
are extended by attributes resp. attribute evaluation rules which compute on 
the attributes when the production is used in the derivation [10]. This idea has 
already been adapted to graph grammars (e.g., [4,8]), and we make use of this
idea of semantics definition together with syntax description: Each hyperedge 
may carry attributes, and productions are extended by attribute evaluation rules 
which compute attribute values when the corresponding production is used in 
the derivation. The term “attributed hypergraph grammar” refers to a hypergraph
grammar which has been extended by attributes and attribute evaluation
rules. 

Figure 7 shows the attributed hypergraph grammar for ladder diagrams. 
Nonterminally labeled hyperedges are depicted by rectangular boxes, terminally 
labeled ones by oval boxes. Productions are depicted in the abbreviated form 
L ::= R1 | ... | Rn if productions L ::= R1, ..., L ::= Rn have the same LHS L.
The upper part of Fig. 7 shows the hypergraph grammar only. Node labels a, b,
Fig. 5. Reduction rules for translating spatial relationship hypergraphs of ladder
diagrams into their hypergraph models. 



etc. describe how the RHS has to fit into the host hypergraph when the LHS has 
been removed. Figure 7 omits the productions’ application conditions that use 
positional attributes of the affected hyperedges in order to guarantee processing 
of boolean expressions from top to bottom. 

The lower part of Fig. 7 assigns program code as attribute evaluation rules
to productions. Each hyperedge carries, depending on its label, an attribute σ
which contains an intermediate semantic description of the sub-hypergraph that
is represented by the hyperedge, an attribute γ for generated IL program code, and
attributes ’name’, ’in’, etc. which are defined by the diagram components themselves.
For readability, attributes are written in a more “mathematical notation” as
edge_index.a, where
edge is the label of a hyperedge, index the index of the hyperedge (0 means the
LHS hyperedge, 1 and 2 RHS hyperedges), and a the attribute of this hyperedge.
Functions and, or, etc. have straightforward implementations which are omitted
in this paper.
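
Since the paper omits these functions, the following is only one plausible reading, not the authors’ implementation. It assumes that the intermediate semantic description is a boolean expression tree and that output flattens it into IL-style instructions (LD, LDN, AND, OR, ST and parenthesized operators); the exact instruction set and all Python names (Var, BinOp, emit, the trailing underscores) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str
    negated: bool = False  # True for a normally closed contact


@dataclass(frozen=True)
class BinOp:
    op: str        # "AND" or "OR"
    left: object
    right: object


def open_(name):
    return Var(name)


def closed(name):
    return Var(name, negated=True)


def and_(left, right):
    return BinOp("AND", left, right)


def or_(left, right):
    return BinOp("OR", left, right)


def emit(expr):
    """Return a list of IL-style instructions that evaluate expr."""
    if isinstance(expr, Var):
        return [("LDN " if expr.negated else "LD ") + expr.name]
    code = emit(expr.left)
    if isinstance(expr.right, Var):
        suffix = "N " if expr.right.negated else " "
        code.append(expr.op + suffix + expr.right.name)
    else:
        # Nested right operand: defer the operator with parentheses.
        code += [expr.op + "("] + emit(expr.right) + [")"]
    return code


def output(expr, name):
    """Evaluate expr and store the result in the output variable name."""
    return emit(expr) + ["ST " + name]


# (a OR NOT b) AND c, stored in LED
for line in output(and_(or_(open_("a"), closed("b")), open_("c")), "LED"):
    print(line)
```

Keeping the expression tree separate from the emitted code mirrors the split between the intermediate semantic description and the generated program code.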
Fig. 6. The hypergraph model of the ladder sub-diagram of Fig. 1 that defines
LED.



The grammar is a context-free hypergraph grammar with embeddings. Productions
P1 ... P24 are context-free productions, P25 is an embedding production:
A block-edge is added between corresponding fb_in and fb_out edges. A
plain context-free hypergraph grammar without embedding productions would
have been sufficient for the restricted ladder diagram language of this paper
with only this simple function block type. However, function blocks with more
than one output cannot be described by a context-free grammar. The grammar
of Fig. 7 is easily extended for such function blocks with new productions similar
to P25.

The translation step from an HGM to its semantic description is performed
as follows: The hypergraph parser (see [12,3]) searches for a derivation of the 
HGM from the starting hypergraph using the HGM grammar. Attributes and 
semantic actions are neglected in this step. The derivation consists of a sequence 
of HGM productions which uniquely induces functional dependencies among the 
attributes of hyperedges that occur in the derivation. As for attributed string 
grammars, these dependencies have to be non-circular, i.e., there has to be a 
total ordering on all instances of dependencies such that each attribute can be 
computed from known values determined by earlier dependencies. This is the 
case in the grammar of Fig. 7. 
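
Finding such an ordering amounts to a topological sort of the dependency graph. A minimal sketch using Python’s standard graphlib module follows; the attribute instance names are invented for illustration, and how the dependencies are collected from the derivation is left out.

```python
from graphlib import CycleError, TopologicalSorter


def evaluation_order(dependencies):
    """dependencies maps each attribute instance to the instances it needs.

    Returns the instances in an order in which every attribute is computed
    only after all attributes it depends on; raises ValueError if the
    dependencies are circular.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        raise ValueError(f"circular attribute dependencies: {err.args[1]}") from err


# Hypothetical attribute instances of a small derivation.
deps = {
    "Def0.gamma": {"Or1.sigma"},
    "Or1.sigma": {"And1.sigma"},
    "And1.sigma": {"Contact1.sigma"},
    "Contact1.sigma": set(),
}
print(evaluation_order(deps))
```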



6 DiaGen 

This translation process is used in DiaGen for creating visual programming 
front-ends for further processing steps, e.g., for compilers from PLC Instruction
List to machine code of specific PLCs.

DiaGen consists of an editor framework and a generator. A formal specifi- 
cation of a diagram language serves as input for the generator which creates 
custom components that build — together with the framework — a graphical ed- 
itor customized for the specified diagram language. Main features, which have 
been described in [12,14,19,13], are: 
P1:  Ladder₀.γ = Defs₁.γ
P2:  Ladder₀.γ = FBs₁.γ · Defs₁.γ
P3:  FBs₀.γ = FB₁.γ
P4:  FBs₀.γ = FBs₁.γ · FB₁.γ
P5:  fb_in₁.σ = Or₁.σ;  FB₀.γ = fb_in₁.γ
P6:  fb_in₁.σ = FbOr₁.σ;  FB₀.γ = fb_in₁.γ
P7:  fb_in₁.σ = alwaysTrue();  FB₀.γ = fb_in₁.γ
P8:  Defs₀.γ = Def₁.γ
P9:  Defs₀.γ = Defs₁.γ · Def₁.γ
P10: Def₀.γ = output(Or₁.σ, out₁.name)
P11: Def₀.γ = output(FbOr₁.σ, out₁.name)
P12: Or₀.σ = or(Or₁.σ, And₁.σ)
P13: Or₀.σ = And₁.σ
P14: Or₀.σ = or(Or₁.σ, FbAnd₁.σ)
P15: Or₀.σ = or(FbOr₁.σ, And₁.σ)
P16: FbOr₀.σ = FbAnd₁.σ
P17: FbOr₀.σ = or(FbOr₁.σ, FbAnd₁.σ)
P18: And₀.σ = Contact₁.σ
P19: And₀.σ = and(Or₁.σ, Contact₁.σ)
P20: And₀.σ = and(Or₁.σ, or(Or₂.σ, And₁.σ))
P21: FbAnd₀.σ = and(fb_out₁.σ, Or₁.σ)
P22: FbAnd₀.σ = fb_out₁.σ
P23: Contact₀.σ = open(open₁.name)
P24: Contact₀.σ = closed(closed₁.name)
P25: fb_in₀.γ = fbCall(block₁.name, block₁.in, fb_in₀.σ,
                       block₁.p1, var₁.value,
                       block₁.p2, var₂.value);
     fb_out₀.σ = block₁.name . block₁.out




Fig. 7. Attributed hypergraph grammar translating hypergraph models of ladder 
diagrams into their semantic representation. 
- Diagrams are internally represented by hypergraphs; a diagram language is
thus a hypergraph language together with a mapping from hypergraphs to
their visual representation as diagrams.

- Nodes and hyperedges carry attributes, and each grammar production is
augmented by layout constraints on attributes accessible in the production. A
constraint solver provides automatic, user-adjustable layout of diagrams [15].

- Diagrams can be edited in a syntax-directed manner. For the diagrams’
context-free share, transformations on derivation trees are used. Further
transformations may modify the diagrams’ hypergraphs directly. To hide
those details from the user, interactions of the user and the editor are
described by certain interaction automata.

- Free-hand editing is also supported. The user can arbitrarily add, delete,
move, or modify parts of the diagram. The underlying hypergraph model
is modified accordingly, and a hypergraph parser distinguishes correct diagrams
from incorrect ones by keeping the underlying hypergraph’s syntactic
meta-structure up to date. Free-hand editing with parser support relaxes the need
to specify a full set of transformations on diagrams for syntax-directed
editing, since free-hand editing can be used for (yet) unspecified diagram
operations. Therefore, this editing mode enhances the usability of editors and also
makes rapid prototyping of diagram editors possible because, as an extreme
case, specification of diagram operations can be omitted completely.

The translation approach which has been described in this paper allows free-hand
editing of diagrams which are then translated into their semantic descriptions.
DiaGen has been used to generate a ladder diagram editor from the specification
which has been outlined in this paper. Figure 1 shows a screenshot of this editor,
Fig. 2 the equivalent IL program that has been created as a semantic description
for the depicted ladder diagram. This description could be further processed, e.g.,
in a compiler that compiles the IL program into the machine code for a specific
PLC. IL would then be the intermediate language in the processing of a ladder
diagram into PLC machine code; the diagram editor with its semantic translation
process acts as a compiler front-end, the translator from IL into machine code
as the compiler back-end.

7 Conclusions and Future Work 

This paper has presented a grammar based method for translating diagrams 
into a semantic description which then can be interpreted by a common inter- 
preter. Diagrams that are translated by this method have to be represented as a 
collection of atomic diagram components with appropriate numeric parameters 
representing their size, position, etc. in the plane. This method makes use of a 
specification of meaningful spatial relationships between diagram components, 
how diagrams are represented by hypergraphs, and an attributed hypergraph grammar
which specifies the diagram syntax as well as how the semantic description
is created. The concepts which have been described in the paper have been
demonstrated for ladder diagrams, a widely used visual programming language 
for Programmable Logic Controllers (PLCs). Ladder diagrams are translated
into their corresponding Instruction List programs, a textual programming lan- 
guage for PLCs. 

The method that has been described on the previous pages is based on rep- 
resentation of diagrams by hypergraphs, which are a generalization of graphs. 
(Hyper)graphs appear to be an appropriate way to represent diagrams on different
levels of abstraction. Furthermore, hypergraph grammars provide a powerful
tool for describing diagram syntax as well as the translation process from the 
diagram into its abstract representation. 

This is not finished work. So far, it has been used “one-way” for translating 
diagrams into a representation which can be further processed by an execution 
environment (e.g., a PLC runtime system). This might be sufficient for this
example. However, if a diagram is translated into a semantic representation which is
then interpreted, and the results have to be translated back into the diagram
language, the unparsing problem has to be solved. This unparsing problem
requires parsing of the interpreter results and creating a diagram in correspon- 
dence with the derivation of the interpreter result. Triple graph grammars [18] 
appear to be a starting point for solving this problem. 
Defining the Syntax and Semantics of
Natural Visual Languages 

Dorothea Blostein 
To bridge the gap between paper and electronic forms of documents, computers
must be able to recognize and generate diagrams as well as text. Diagrams used 
in society are expressed in a variety of notations, which we call natural visual 
languages. Examples include notations used for mathematics, music,
engineering drawings and architecture. These visual languages do not have 
fixed, formal definitions, but evolve through use in society. This paper 
examines the use of graph transformation in processing natural visual 
languages, describing the difficult problems in this domain, existing graph 
transformation work in this area, and competing methods. Many problems 
have not been adequately addressed by any technique. The use of graph 
transformation is appropriate, since the representation and manipulation of 
spatial and logical relationships is central to the computation. 

1. Introduction 

This paper characterizes the role of graph transformation in the processing of 
natural visual languages. Examples of natural visual languages include notations used 
in engineering drawings, architecture drawings, music notation, dance notation and 
mathematical notation. These are visual languages; they use spatial arrangements of 
symbols to convey information. They are also natural languages; their syntax and 
semantics are defined through use in society. To summarize the terminology:

• a natural visual language is a two-dimensional language of symbols. Most natural 
visual languages are used to communicate specific types of information, such as a 
music composition, a circuit diagram, or an architecture plan. 

• a diagram is a drawing (a two-dimensional arrangement of symbols) expressed in 
some natural visual language. 

Computer processing of diagrams is needed to bridge the gap between paper 
documents and electronic documents. The “paperless office” is not becoming a reality; 
both electronic and paper documents are used in increasing numbers. 

To characterize the role of graph transformation in generation and recognition of 
diagrams, we examine the following questions:

• What are the difficult problems in this domain? 

• How has graph transformation been used in this domain? 

• What competing methods are currently used? 

• What role can graph transformation play in the future? 

We conclude that graph transformation is useful for defining and implementing parts of 
a diagram processing system. To encourage this, efforts should be made to advertise 
graph transformation in the research communities that investigate diagram processing. 
2. What are the difficult problems in this domain?

Diagram-processing software consists of recognizers, editors, and generators (Figure 1). 
Usually editing and generation occur in a loop, whereas recognition is a one-time 
operation that is followed by manual correction of recognition errors. Presently, 
diagram generation technology is more advanced than recognition technology; many 
widely-used diagram generators are on the market, whereas most diagram recognizers are 
in the research stage. 







Figure 1 A diagram needs to be treated both as an image (for display to the user) and as 
information (for intelligent editing, storing in a database). Generation and 
recognition software translate between the image and information, by using a 
priori knowledge of the visual language syntax and semantics. A WYSIWYG 
(What You See Is What You Get) editor automatically generates updated diagrams, 
whereas a batch editor, such as LaTeX, generates diagrams on request. 



A major challenge in diagram processing is to find appropriate means of capturing 
the syntax and semantics of a natural visual language. The following difficulties are 
encountered. 

• Diversity. There is a great diversity of visual languages [20]. For example, some
languages use coordinate axes (time and pitch coordinates in music notation), 
whereas others have an underlying graph structure (flow charts and schematics). 
Diverse spatial relationships are used, including parallelism, proximity, 
containment, and alignment. 

• Informally defined syntax and semantics. Natural visual languages are informally
defined, through common usage. Written descriptions of these languages tend to 
be informal, and present information by example, as in descriptions of music 
notation [36]. Language syntax can be quite complex, with numerous rules and 
exceptions to rules. 

• Dialects. Natural visual languages have dialects. A fixed language definition does
not suffice for computer processing. Rather, a core language and variants must be
defined. Perhaps techniques used to process dialects of programming languages
(COBOL, for example) can be adapted. 

• Diagram layout. The same information can be represented by a large set of
diagrams. These diagrams differ in layout, in readability, and in aesthetic appeal. 
For example, changing the layout of a circuit diagram affects the appearance of the 
diagram without changing the information being conveyed. Petre states that 
“Much of what contributes to the comprehensibility of a graphical representation 
isn’t part of the formal programming notation but a ‘secondary notation’ of layout, 
typographic cues, and graphical enhancements that is subject to individual skill” 
[27]. The rules of layout are not well codified, making it difficult for a recognizer 
to exploit layout cues, or for a generator to produce diagrams that exhibit good 
layout. 

• Complex correspondence between syntax and semantics. Many visual languages
have a complex correspondence between spatial constructs and information content. 
In general, it is not possible to define a mapping between a small portion of the 
notation and a small portion of the information that is conveyed. For example, 
large, widespread sets of symbols are used to represent pitch in music notation and 
dimensions in engineering drawings. 

• Irregular symbol placement. The language syntax must not only describe the ideal
position for symbols, but must also characterize acceptable deviations in symbol 
placement. This is particularly true for hand-drawn diagrams. Irregularities in 
symbol placement make it difficult to determine the logical relationships among 
symbols from the observed spatial relationships. For example, in handwritten 
mathematical expressions, complex conditions are needed to determine operator 
range [5] [13].

• Errors and uncertainty in symbol recognition. Document images contain noise.
As a result, symbol segmentation and symbol recognition are error-prone.
Subsequent processing must deal with the following types of errors: incorrectly
identified symbols, missing symbols, symbols that have been split into two, and
symbols that have been merged. Contextual information from the later processing
stages can provide valuable feedback to symbol recognition. In some situations,
diagram recognition software does not have to cope with noise. For example, a
file with display-format data (PostScript, FrameMaker) can be analyzed [25] [29], or
symbols manually entered into a drawing tool can be analyzed.



3. How has Graph Transformation been used in this domain? 

Existing diagram recognition systems that use graph transformation are 
summarized in Table 1. Most of these are research prototypes; the table recognition
work by Rahgozar and Cooperman became part of a commercial product. These
systems are discussed further in [4]. Graph transformation has also been applied to
diagram editing, as in [1] [12]. Hypergraph rewriting can be used as a uniform means
of modeling diagram syntax and semantics [23], but is currently limited to small visual 
languages with straightforward mapping between syntax and semantics. 

Graph transformation is well suited to diagram processing. Graphs represent 
relationships explicitly and clearly. Both spatial relationships and logical relationships 
are conveniently represented. Graph transformation provides an intuitive means to 
construct and deduce relationships, and eventually to understand the information 
conveyed by the document image. Baumann finds that a main advantage is that graph
productions define syntactic and semantic knowledge in a declarative way [2]. In
contrast, the previous procedural system was difficult to maintain because the
knowledge of symbol shape and music syntax was distributed all over the code.
Rahgozar and Cooperman use graph transformation to recognize the geometry and
logical structure of tables [35]. They cite as an advantage the ability to trade off
generality and efficiency, due to the separation of entity recognition and graph 
rewriting. 



Table 1 Graph transformation applied to diagram recognition.

Year published    | Type of diagram  | Authors                   | Use of graph transformation                                  | Graph Trans. Engine
------------------+------------------+---------------------------+--------------------------------------------------------------+--------------------
1982 [6]          | circuit diagrams | Bunke                     | Transform image-oriented graph to information-oriented graph | Own implementation
1988 [9]          | machine drawings | Dori, Pnueli              | Web grammar to enumerate notations for dimensioning          | Own implementation
1993 [10]         | music notation   | Fahmy, Blostein           | Transform image-oriented graph to information-oriented graph | Own implementation
1994, 95 [28] [2] | music notation   | Baumann, Pies             | Transform image-oriented graph to information-oriented graph | Own implementation
1995, 99 [13] [5] | math notation    | Grbavec, Blostein, Schürr | Transform image-oriented graph to information-oriented graph | Own / PROGRES
1996 [35]         | tables           | Rahgozar, Cooperman       | Deterministic graph grammar                                  | Own implementation
1997 [19]         | math notation    | Lavirotte, Pottier        | Deterministic graph grammar                                  | Own implementation
1998 [11]         | music notation   | Fahmy, Blostein           | Generalized form of discrete relaxation                      | Own implementation
1999 [38]         | math notation    | Smithies, Novins, Arvo    | Graph grammar with backtracking parser                       | Own implementation



4. What competing methods are currently used? 

Many formalisms have been investigated for use in describing visual languages. 
The extensive survey by Marriott et al. [21] discusses grammatical approaches (string 
grammars with generalized relations, graph grammars, and multiset grammars), logic- 
based approaches (definite clauses, constraint logic programming, formalization of 
topology, logic formalisms containing visual expressions), and algebraic approaches 
(algebraic formalisms defining picture domains, application domains, and their 
correspondence). This work has been most intensively tested on newly-developed 
visual languages, particularly visual programming languages. These languages are
defined in a way that makes them amenable to formal treatment (unambiguous, easy to 
parse, clear semantics attached to spatial relationships). In present form these 
approaches cannot handle natural visual languages. 

Research into diagram recognition is reported in various journals and conference 
proceedings, including [15] [31] [32] [33] [34]. Researchers in document image 
analysis have tried various approaches, including blackboard systems, schema-based 
systems, hidden Markov models, procedural code, syntactic methods, and graph
transformation. Further details may be found in [3]. These recognition systems tend to
be large and complex, and heavily tailored toward the particular visual language being 
recognized. Successful recognition has been achieved for particular domains. Further 
work is required to increase the performance, robustness, and versatility of these 
systems. 

Commercial software for diagram editing and generation is highly successful. 
However, little is published about the software structures and algorithms used;
companies treat this as proprietary information. Certain aspects of the diagram
generation problem, such as graph layout, have received extensive attention in the 
literature [30] [37] [39]. 

5. What role can graph transformation play in the future? 

Section 2 introduced a list of difficult problems in diagram processing. Existing 
systems address some of these problems, as summarized in Table 2. Further work is 
needed, as discussed below. 

Better processing of noisy data could be achieved by extending graph transformation 
to include probabilistic information. Probabilities can be attached to graph productions 
(a stochastic grammar, as in [7]) or to graph nodes and edges. Work on probabilistic 
and elastic graph matching is relevant [16] [22] [40]. 
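
As an illustration of attaching probabilities to productions, the following minimal Python sketch scores competing interpretations of the same noisy input by the product of the probabilities of the productions used to derive them. The production names and probability values are hypothetical and are not taken from [7]; a real system would estimate them from data.

```python
import math

# Hypothetical production probabilities (illustrative values only).
PRODUCTION_PROB = {
    "note_head+stem->note": 0.9,
    "blob->note_head": 0.6,
    "blob->noise": 0.4,
    "line->stem": 0.7,
    "line->barline": 0.3,
}

def derivation_log_prob(productions):
    """Sum of log-probabilities of the productions used in one derivation."""
    return sum(math.log(PRODUCTION_PROB[p]) for p in productions)

def most_likely(derivations):
    """Return the name of the derivation with the highest probability."""
    return max(derivations, key=lambda name: derivation_log_prob(derivations[name]))

# Two competing interpretations of the same blob and line.
candidates = {
    "note": ["blob->note_head", "line->stem", "note_head+stem->note"],
    "noise_and_barline": ["blob->noise", "line->barline"],
}
best = most_likely(candidates)
```

Working in log space avoids numerical underflow when derivations become long; attaching probabilities to nodes and edges instead of productions would change only how the score is accumulated.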

To handle dialects of a visual language, graph transformation could describe a 
"pure" language, as well as allowable deviations. Tree transformation languages such 
as TXL have been successful in processing dialects of string languages [8]. 

Existing grammar and parsing methods should be extended to cope with the 
irregular symbol placement which occurs in both typeset and handwritten documents. 
Some progress has been made in math recognition [13]. 

There is a need for a software architecture that integrates graph transformation 
modules with other parts of the implementation. Image-level operations such as noise 
reduction, skew removal, and symbol recognition [24] are typically implemented 
without graph transformation. If graph transformation is used in later stages of diagram 
recognition, it should provide contextual feedback to the early recognition stages. 
Relevant existing work includes a table recognition system in which a control strategy 
chooses whether to carry out a task by a grammar production or by an external 
recognition module [35]. 

Research is needed to clarify the relationship between graph transformation and 
competing methods. For example, multiset grammars and graph grammars are 
described as follows in a review by Marriott et al. ([21], p. 8): 

More generic approaches can be broadly separated into two types. Attributed 
multiset based grammars in which spatial relationships between symbols are 
implicit and can only be derived by computations involving the geometric 
attributes of the symbols, and edge-labelled graph grammars in which symbols do 
not have attributes, but rather the relationships between symbols are explicitly 
represented as edges in the graph which may be rewritten in the grammar. 
Table 2. How current systems address the diagram-processing problems listed in Section 2 

Problem: Diversity 
    Graph transformation approaches: Simple visual languages can be described [23]. 
    Other approaches: Proposed approaches make a start. For example, Pasternak’s 
        drawing-interpretation kernel iteratively applies declarative geometrical 
        constraints to combine graphical objects into higher-level objects [26]. 

Problem: Informally defined syntax and semantics 
    Both approaches: Designers of existing systems primarily use introspection and 
        testing to determine the syntax and semantics of the visual language. Some 
        written descriptions exist, e.g. music notation [36] and math notation [17]. 

Problem: Dialects 
    Graph transformation approaches: (none listed) 
    Other approaches: No systematic solution for diagram recognition. Diagram 
        generators allow the user to select among several possible diagram styles. 

Problem: Diagram layout 
    Graph transformation approaches: (none listed) 
    Other approaches: Diagram generation systems achieve good results using 
        techniques such as constraints, or elasticity models [14]. Current diagram 
        recognizers do not exploit much layout information. 

Problem: Complex correspondence between syntax and semantics 
    Graph transformation approaches: No general solution. One approach is the use of 
        transformation phases such as Build, Weed, Incorporate [10]. 
    Other approaches: No general solution. 

Problem: Irregular symbol placement 
    Graph transformation approaches: Irregularly placed math symbols can be handled; 
        the interaction among graph productions is complex [13] [5]. 
    Other approaches: Many recognition systems assume strong restrictions on symbol 
        placement. A general solution is needed. 

Problem: Errors and uncertainty 
    Graph transformation approaches: A generalized form of discrete relaxation has 
        been used to find a consistent interpretation in a set of symbol candidates 
        [11]. 
    Other approaches: Various techniques are used [3], but a general solution is 
        needed. A promising approach is to find the most likely interpretation of a 
        document image, relative to a given Hidden Markov Model of image generation 
        [18]. 

We conjecture that graphs are better than multisets in diagram processing. Diagram 
recognition requires complex computation to find the important spatial relationships. 
This is probably easier to accomplish when relationships are represented explicitly. 
6. Conclusion 

It is difficult to describe the syntax and semantics of natural visual languages, such 
as music notation, mathematics notation, and engineering drawings. Graph 
transformation is a promising method for capturing language syntax and semantics, and 
thereby supporting diagram generation and recognition. Much work remains to be 
done. 

Efforts should be made to advertise graph transformation in the research 
communities that investigate diagram processing. We cannot present complete 
solutions, but we can show inspiring examples of graph transformation use. Bunke’s 
work [6] inspired us to start using graph transformation. In turn, our work [10] 
inspired several other researchers to use graph transformation in analysis of music 
notation [2] [28], tables [35], and math notation [19]. The latter inspired another math 
recognition system to use graph transformation [38]. These researchers were quite 
inventive in adapting graph transformation to meet their needs. Such lively 
development is likely to continue, provided that we make researchers aware of graph 
transformation as a possible computational technique. 
GenGEd: a Development Environment for Visual Languages 

Abstract. This contribution presents GenGEd, a development environment for 
visual languages. GenGEd offers a hybrid language for defining the syntax of 
visual languages, consisting of an alphabet and a grammar. Correspondingly, the 
main components of GenGEd are an alphabet editor and a grammar editor. The 
syntax description is the input of a diagram editor that allows the 
syntax-directed manipulation of diagrams. Both the grammar definition and the 
manipulation of diagrams are based on algebraic graph transformation and 
graphical constraint solving. 

Keywords: visual language, algebraic graph transformation, constraint 
solving, rule- and constraint-based editor. 



1 Introduction 

Visual languages (VLs) are emerging in various application areas; compare, for 
example, [Shu88,Cha90,BGL95,Sch98]. Usually they are tightly integrated with a 
corresponding visual environment (VE). This tight integration is a disadvantage 
when the concepts of a language or its visual notations change, because a partial 
reimplementation of the VE then becomes necessary. Such reimplementations are 
time consuming and costly. For this reason more and more generators of VEs are 
appearing, whose input is mostly a textual syntax description of a specific VL. In 
general, visual statements are difficult to describe textually because of their 
graphical structure. On the other hand, visual descriptions are usually not 
sufficient to define everything that is needed [Sch98]. To avoid such problems we 
propose GenGEd, a development environment for VLs. GenGEd implements the visual 
definition of VLs, whereas the textual parts are concerned with graphical 
constraints and common datatypes like strings or integers. 

Within GenGEd the syntax of a VL is defined by an alphabet and a grammar. 
Accordingly, the main components of GenGEd, as illustrated by Figure 1, are an 
alphabet editor and a grammar editor. The constraint-based alphabet editor allows 
for the definition of a VL-alphabet. Such an alphabet is a set of templates which 
can be instantiated to build diagrams. The grammar editor is then used to construct 
more complex rules using simple insertion/deletion rules automatically generated 
from an alphabet. The user-defined VL-grammar is the input of a rule- and 
constraint-based diagram editor allowing the syntax-directed manipulation of 
VL-diagrams by applying grammar rules. 




Figure 1. The GenGEd components. 



Usually a diagram comprises several symbols and connections between symbols. In 
GenGEd a diagram is represented by an attributed graph structure consisting of a 
logical level (abstract syntax) which is connected with the layout level (concrete 
syntax) by suitable operations [BTMS99]. On the implementation side, the logical 
level of a diagram is mapped onto an attributed graph where the nodes represent the 
symbols of the diagram and the edges the connections. Common datatypes like strings 
or integers, which are also interpreted as symbols, are mapped onto the attributes 
of a node. Based on attributed graphs, the manipulation of diagrams is supported by 
the integrated graph transformation machine Agg [TER99]. The layout level of a 
diagram is given by a set of graphics and graphical constraints. These constraints 
pertain mainly to positions and sizes of graphics. The solving of graphical 
constraints is supported by the integrated constraint solver ParCon [Gri96]. 
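
The mapping of the logical level onto an attributed graph can be pictured with a small Python sketch. It is only an illustration under our own assumptions: the class and method names (SymbolNode, AttributedGraph, add_symbol, add_connection) are hypothetical and do not reflect the actual data structures of GenGEd or Agg, which are implemented in Java.

```python
from dataclasses import dataclass, field

@dataclass
class SymbolNode:
    """A diagram symbol; datatype values are stored as node attributes."""
    ident: str
    symbol_type: str
    attributes: dict = field(default_factory=dict)

@dataclass
class AttributedGraph:
    nodes: dict = field(default_factory=dict)   # ident -> SymbolNode
    edges: list = field(default_factory=list)   # (connection type, source, target)

    def add_symbol(self, ident, symbol_type, **attributes):
        self.nodes[ident] = SymbolNode(ident, symbol_type, attributes)

    def add_connection(self, connection_type, source, target):
        # Both end points must already exist as symbols of the diagram.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("connection end point is not a symbol of the diagram")
        self.edges.append((connection_type, source, target))

# A tiny class diagram: one association that begins and ends at a class.
diagram = AttributedGraph()
diagram.add_symbol("c1", "Class", name="Person")
diagram.add_symbol("c2", "Class", name="Address")
diagram.add_symbol("a1", "Association")
diagram.add_connection("begin", "a1", "c1")
diagram.add_connection("end", "a1", "c2")
```

The layout level (graphics and graphical constraints) is deliberately left out of this sketch; in GenGEd it is kept separately and solved by ParCon.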

To make working with GenGEd as easy as possible, the editors have similar 
graphical user interfaces (GUIs). Each GUI mainly comprises a structural view and a 
drawing area for the elements belonging to the VL-alphabet, VL-grammar or diagram. 
The structural view holds the (textual) logical level of symbols and connections, 
which can be selected for further manipulation. 

This paper is organized as follows: In Section 2 the components of the alphabet 
editor are discussed. The grammar editor and the rule- and constraint-based 
diagram editor are the topic of Section 3. Related work is presented in Section 4. 
Some concluding remarks are made in Section 5. 

2 Alphabet Editor 

The alphabet editor supports the definition of a VL-alphabet consisting of symbols 
and connections. For example, symbols of a VL for class diagrams are given by a 
class represented by three rectangles, or an association represented by an arrow. 
Usually, the association arrow is connected with classes, so it begins and ends at 
a class symbol. Accordingly, the alphabet editor comprises a symbol and a 
connection editor. In order to test the defined alphabet we developed a simple test 
editor (not illustrated in Figure 1). It supports the purely constraint-based 
drawing of diagrams, not considering any VL-rules. 

Symbol Editor The symbol editor works similarly to a common graphical editor: 
several primitive graphics are available, such as rectangle, circle, ellipse, 
polygon, line, etc. The user can change the form of every primitive and change 
default properties like fill and line color, style, etc. As usual, primitives can 
be translated, scaled, rotated and stretched. Not only the drawing of symbol 
graphics is supported, but also the import of bitmap graphics (GIF, JPG). 



Figure 2. Alphabet Editor with activated Symbol Editor. 



In contrast to a common graphical editor, the final size as well as the group- 
ing of graphical primitives in order to build up a symbol graphic is done via 
constraints. So, for example, the three rectangles of a class symbol must be 
connected by constraints as shown in Figure 2. A constraint menu is available 
where suitable constraints can be selected (see Figure 3). Constraints are used 
automatically to connect certain attachment points to a primitive. Such points 
are useful for defining connections within the connection editor. Internally they 
are treated like normal primitives. 

Figure 3. Constraint dialog to apply constraints to primitives. 

Datatypes like strings, integers or lists of strings are predefined symbols. They 
are implemented by predefined components and can be chosen in the same way as 
common primitives. For each datatype the predefined default value can be changed by 
the user. In addition to the properties of primitives, visual datatype properties 
like color or font can be set by the user. 

For each symbol a unique name must be defined which is interpreted as the 
symbol type. The symbol graphic, comprising primitives and constraints, defines 
the default layout which is used for instances. The same is true for datatype sym- 
bols where the default layout is given by the default value and visual properties. 

Connection Editor Connections between symbols can be defined using the 
connection editor. On the logical level each connection is represented by a unary 
operation (connection type). Accordingly, there is a set of constraints for the 
layout level. So for every connection a source and target symbol must be selected 
from the structure view and suitable constraints must be defined using the same 
mechanisms as are available within the symbol editor. 

Connections can be defined by considering the semantical aspects of a VL. 
Consider for example the begin and end connection between an association and a 
class symbol. On the logical level, defining the association symbol as the source 
and the class symbol as the target of both connections (operations) requires that 
the classes exist at the time the association is inserted. This follows from the 
underlying formalism of attributed graph structures [BTMS99]. 

High Level Constraints Within the alphabet editor graphical constraints are used, 
on the one hand, to connect primitive graphics in order to build up symbol graphics 
and, on the other hand, to define the connections. The constraint solving is done 
by ParCon [Gri96], a constraint solver for so-called low level constraints (LLCs). 
Such LLCs are not very user-friendly, which is why we provide high level 
constraints (HLCs) whose names and parameters are shown in the constraint dialog 
(see Figure 3). Possible types for parameters can be declared by object-oriented 
mechanisms, like inheritance. 

For the declaration of HLCs we developed a textual definition language supporting 
mathematical functions, intervals, overloading of constraints, use of 
“sub-constraints”, casting of types, support of parameter sets, etc. [Sch99]. 
Internally an HLC is mapped onto arbitrarily many LLCs, which are processed by 
ParCon. An LLC can take one of the following forms: normal linear formula, product 
formula, formula to set a point-to-point distance, or a formula to set the equality 
of two points. In every LLC we can use the parameters of an HLC as well as local 
variables. For example, the following simple HLC sets a minimal length for a line 
and uses a local variable, a linear formula as well as an expression for a 
point-to-point distance: 

constraint MinLength(Line l, ConstValue minLength) { 
    Value length;            # local variable 
    length >= minLength;     # linear formula 
    l.s R= length * l.t;     # distance between start and end point 
} 
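
To make the intended meaning of this HLC concrete, the following Python sketch checks whether a given line satisfies it. This is only a checker, not a constraint solver, and it rests on an assumed reading of the HLC suggested by its comments: there is a local value `length` that is at least `minLength` and equals the distance between the start point and the end point of the line. The function name and the point representation are hypothetical.

```python
import math

def min_length_satisfied(start, end, min_length):
    """Check the MinLength idea for a line given by two (x, y) points.

    Assumed reading: a local value `length` exists with
    length >= min_length and distance(start, end) == length.
    """
    length = math.dist(start, end)   # point-to-point distance
    return length >= min_length      # linear inequality on the local value

ok = min_length_satisfied((0, 0), (3, 4), 5)         # distance is exactly 5
too_short = min_length_satisfied((0, 0), (1, 1), 5)  # distance is about 1.41
```

A solver such as ParCon would instead adjust the end points until all constraints hold; the sketch only evaluates the condition for fixed coordinates.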



Many HLCs are predefined in the HLC database, which can be extended by a user’s 
own HLCs. When HLCs are applied by using the constraint dialog, consistency checks 
are done by the constraint component of GenGEd to avoid contradictory constraints. 

3 Grammar and Diagram Editor 

The grammar of a VL comprises a start diagram and a set of VL-rules. Each VL-rule 
consists of a left hand side (LHS), a right hand side (RHS) and an optional set of 
negative application conditions (NACs). NACs describe certain diagram 
constellations which must not exist when a rule is to be applied. The LHS, the RHS, 
and each NAC are themselves diagrams. The elements of a diagram (symbols and 
connections) are instances of the corresponding templates in the VL-alphabet. 
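
The role of NACs can be sketched in a few lines of Python. The sketch assumes a very simple graph encoding (a pair of a node-to-type dictionary and a set of labelled edges) and a brute-force matcher; these, as well as the function names `matches` and `applicable`, are our own illustrative assumptions and not the matching algorithm of Agg. A rule is considered applicable at a match of its LHS if no NAC can be matched as an extension of that match.

```python
from itertools import permutations

def matches(pattern, host, fixed=None):
    """Yield injective, type-preserving node mappings from pattern into host.

    A graph is a pair (nodes, edges): nodes maps id -> type, edges is a set of
    (label, source, target). `fixed` pre-assigns some pattern nodes.
    """
    p_nodes, p_edges = pattern
    h_nodes, h_edges = host
    fixed = {n: h for n, h in (fixed or {}).items() if n in p_nodes}
    free = [n for n in p_nodes if n not in fixed]
    available = [h for h in h_nodes if h not in fixed.values()]
    for image in permutations(available, len(free)):
        m = {**fixed, **dict(zip(free, image))}
        if all(p_nodes[n] == h_nodes[m[n]] for n in p_nodes) and \
           all((lab, m[s], m[t]) in h_edges for lab, s, t in p_edges):
            yield m

def applicable(lhs, nacs, host):
    """Return the first LHS match that violates no NAC, or None."""
    for m in matches(lhs, host):
        blocked = any(next(matches(nac, host, fixed=m), None) is not None
                      for nac in nacs)
        if not blocked:
            return m
    return None

# Hypothetical rule: connect two classes unless an association already
# begins at the first class and ends at the second.
lhs = ({"c1": "Class", "c2": "Class"}, set())
nac = ({"c1": "Class", "c2": "Class", "a": "Association"},
       {("begin", "a", "c1"), ("end", "a", "c2")})

host_free = ({"p": "Class", "q": "Class"}, set())
host_linked = ({"p": "Class", "q": "Class", "x": "Association", "y": "Association"},
               {("begin", "x", "p"), ("end", "x", "q"),
                ("begin", "y", "q"), ("end", "y", "p")})
```

Enumerating all permutations is exponential and only acceptable for diagrams of a handful of symbols; it is used here to keep the sketch short.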

The grammar editor is based on a given VL-alphabet. It uses, for example, the 
defined symbols and connections to generate a set of edit commands, called alphabet 
rules. Alphabet rules control the insertion and deletion of symbols and connections 
in the corresponding diagram. Such diagrams are the start diagram as well as the 
LHS, RHS, and NACs of the desired VL-rules. As in the alphabet editor, the names of 
the alphabet rules are shown in the structural view of the grammar editor. They can 
be selected for display in the corresponding drawing area (see Fig. 4). 

Figure 4. Grammar Editor. 

A match must be defined between the alphabet rule’s LHS and the current diagram 
when a rule is applied. Symbols can be selected for a match, but not connections, 
which are represented by constraints; these are mapped implicitly. It is possible 
to add and remove mappings to and from a match and to complete the match. During a 
transformation step, first the logical level of the diagram is taken into account 
and then the layout level, where graphical constraints are solved. The 
transformation of the logical level is done by the integrated graph transformation 
machine Agg. After transformation, the structural correctness of the resulting 
diagram is checked. This check is needed because some side effects may occur after 
transformation, owing to the mapping from the diagram’s graph structure to a simple 
graph. Therefore, we have to make sure that for every symbol all outgoing 
connections with the appropriate target symbols exist. If the resulting diagram is 
not correct, the transformation is cancelled and the user gets a corresponding 
message. The layout level of the resulting diagram is built up according to the 
logical level: for each added (deleted) element, a graphical object or constraint 
set, respectively, is added (deleted). After constraint solving via ParCon the 
layout level is displayed. 
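
The structural correctness check described above can be sketched as follows. The sketch assumes that each connection type declared in the alphabet must be present exactly once for every symbol of its source type (our reading of connections as unary operations); the alphabet table, the graph encoding and the function names are hypothetical and not part of GenGEd.

```python
# Hypothetical alphabet: connection type -> (source symbol type, target symbol type).
CONNECTIONS = {"begin": ("Association", "Class"), "end": ("Association", "Class")}

def structurally_correct(nodes, edges, connections=CONNECTIONS):
    """nodes: id -> symbol type; edges: set of (connection type, source, target)."""
    for ident, symbol_type in nodes.items():
        for conn, (src_type, tgt_type) in connections.items():
            if symbol_type != src_type:
                continue
            targets = [t for (c, s, t) in edges if c == conn and s == ident]
            # Assumption: each operation is defined exactly once, on an
            # existing symbol of the declared target type.
            if len(targets) != 1 or nodes.get(targets[0]) != tgt_type:
                return False
    return True

def apply_checked(nodes, edges, transform):
    """Apply `transform` to copies; keep the result only if it is correct."""
    new_nodes, new_edges = transform(dict(nodes), set(edges))
    if structurally_correct(new_nodes, new_edges):
        return new_nodes, new_edges, True
    return nodes, edges, False   # transformation cancelled

nodes = {"c1": "Class", "c2": "Class", "a1": "Association"}
edges = {("begin", "a1", "c1"), ("end", "a1", "c2")}

def delete_c2(n, e):
    """Delete class c2 together with its incident edges (a dangling side effect)."""
    n.pop("c2")
    return n, {x for x in e if "c2" not in x[1:]}

_, _, accepted = apply_checked(nodes, edges, delete_c2)
```

Deleting the class leaves the association without its end connection, so the check fails and the original diagram is kept, which mirrors the cancellation described in the text.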

Once the diagrams of a VL-rule (LHS, RHS, NACs) have been constructed, the 
mappings between their symbols can be defined in the same way as matches for rule 
application are defined. For example, it is possible to add and remove mappings to 
and from the rule, and even to complete the rule or NAC mappings. 

Datatypes and Rule Parameters Datatypes are predefined symbols which are defined 
through algebraic operations and constants. A “list of strings” datatype, for 
example, can be defined by the empty list as a constant and operations like first 
or add. Default datatype values set in the symbol editor are constants. When 
defining a VL-rule these constants can be replaced by variables or complex 
expressions. 

In order to make the treatment of datatypes as easy and extendable as possible we 
use Java objects and Java expressions as available in the attribute component of 
Agg. The only extension to the Agg concepts lies in the graphical representation, 
so datatype objects are represented by graphics, not only by textual expressions. 
Datatype objects (constants, variables or complex expressions) must be declared 
using a corresponding menu. On the LHS of a VL-rule only variables may occur as 
attributes of a symbol. When a rule is applied the variables are matched 
implicitly, and the values are transformed in a way given by the expression on the 
RHS. 
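
The interplay of datatype operations, LHS variables and RHS expressions can be illustrated by a short Python sketch. GenGEd itself uses Java objects and Java expressions via Agg; Python merely stands in here, and the helper `apply_attribute_rule` as well as the argument order of `add` are our own assumptions.

```python
EMPTY = ()   # the empty list of strings as a constant

def add(item, lst):
    """Return a new list of strings with `item` appended."""
    return lst + (item,)

def first(lst):
    """Return the first element; undefined (an error) for the empty list."""
    if not lst:
        raise ValueError("first is undefined on the empty list")
    return lst[0]

def apply_attribute_rule(lhs_vars, rhs_exprs, node_attrs):
    """Bind LHS variables to the matched node's values, then evaluate the RHS.

    lhs_vars:  attribute name -> variable name
    rhs_exprs: attribute name -> function of the variable binding
    """
    binding = {var: node_attrs[attr] for attr, var in lhs_vars.items()}
    return {attr: expr(binding) for attr, expr in rhs_exprs.items()}

# Matched class symbol whose method list already contains one entry.
class_node = {"methods": add("getName", EMPTY)}

# LHS: attribute `methods` is the variable ms; RHS: add("setName", ms).
result = apply_attribute_rule(
    {"methods": "ms"},
    {"methods": lambda b: add("setName", b["ms"])},
    class_node,
)
```

Because the LHS contains only variables, matching an attribute never fails; the RHS expression alone determines the new value.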

VL-rules may additionally have rule parameters for datatype values which 
can be defined textually using a corresponding menu. A rule parameter is, similar 
to a variable, given by a name and a type like x: String. Such parameters may 
already express some semantical aspects of a VL. In the case of class diagrams 
for example we could enforce that every class is inserted into a diagram with a 
user-defined class name. This name can be added by a rule parameter. Whenever 
the VL-rule, which allows for the insertion of a class, is applied, the user is forced 
to define a class name. 

Diagram Editor The user-defined VL-grammar is the input of the desired diagram 
editor allowing the syntax-directed manipulation of diagrams. The diagram editor 
comprises one drawing area instead of the four in the grammar editor for defining 
VL-rules. The edit commands of the diagram editor are the VL-rules. Matching is 
done by successively asking the user to provide the symbols corresponding to the 
symbols occurring in the rule’s LHS. When all symbols are mapped, the match is 
completed by connection mappings, and the transformation takes place (including 
the check for correctness). 

Whenever a VL-rule comprises rule parameters, the user is successively asked 
to define suitable values for these parameters. These values are (graphically) 
placed after transformation by considering the defined connection constraint. 

4 Related Work 

Many different formalisms have been proposed for the definition of VLs [MM98]. In 
contrast to existing approaches using either algebraic techniques [DU98] or graph 
grammar approaches as presented by Göttler [Göt87], Andries, Engels, Rekers, Schürr 
[RS96,AER98] and Minas, Viehstaedt [MV95,Min98], our approach uses both for 
defining an alphabet and a language grammar [BTMS99]. We use algebraic 
specification techniques to define graphical symbols, links and layout constraints 
in an axiomatic way. This seems to capture the very basic issues of a visual 
language in a natural way. The grammatical structure of a language is described by 
graph transformation, which is again a very natural formalism for this purpose. 

As with formalisms, many different tools have been proposed to support the 
definition of VLs. In contrast to other approaches GenGEd allows for the explicit 
definition of alphabets and grammars for VLs. In [CODL95] the VLCC environment is 
introduced, which also supports the visual definition of VLs. A symbol editor can 
be used to define terminal and non-terminal symbols. The defined symbols are then 
available within a production editor allowing the definition of context-free 
positional grammar rules. In contrast, we use algebraic graph grammars which are 
not restricted to be context-free. Nevertheless, VLCC not only offers a convenient 
way to define VLs visually but also generates free-hand editors for the defined 
VLs. 

5 Conclusion 

In this paper we introduced the GenGEd environment supporting the visual 
definition of VLs. The VL definition comprises an alphabet and a grammar. The 
alphabet is the basis to define the grammar. The user-defined grammar is the 
input of a syntax-directed diagram editor whose edit commands are the VL-rules 
occurring in a grammar. Diagrams are represented by attributed graph structures 
which are mapped to attributed graphs. The manipulation of diagrams is done 
by the integrated graph transformation system Agg [TER99] together with the 
constraint solver ParCon [Gri96]. 

The current implementation of GenGEd is done using Java 1.1, as is the graph 
transformation system Agg. ParCon is used as a server and is implemented in 
Objective C. The implementation of GenGEd as well as some case studies are 
described in [Sch99] (alphabet and test editor) and in [Nie99] (grammar and diagram 
editor). GenGEd can be used to define several VLs because it allows any kind of 
connections. The case studies comprise box-like Nassi-Shneiderman diagrams as well 
as a restricted kind of graph-like Statecharts. The modeling of graph-like class 
diagrams is sketched in [BTMS99]. Further case studies are in preparation. 

GenGEd is available at http://tfs.cs.tu-berlin.de/~genged. 

The current implementation is not complete with respect to several desirable fea- 
tures. For example, parsing is not provided nor is code generation. With respect 
to industrial relevance, both features are important as well as the connection of 
several generated diagram editors. These are the main topics for future work. 
Abstract. This paper deals with computer aided design in architecture. 
A two-phase representation of objects is used that separates the 
definition of structure from the interpretation. It is shown how to 
integrate such a graph-based representation with the commercial 
ArchiCAD system used by many architects. Preliminary results 
reported in this paper indicate that it is possible to augment ArchiCAD 
by a graph grammar-based tool that would allow the designer to 
generate alternative solutions and to evaluate them. 



1 Introduction 

CAD tools like AutoCAD or MicroStation have considerably increased the efficiency 
of the designer’s work by freeing him from tedious manual drawing. However, their 
contribution to creativity and conceptual ingenuity is rather negligible, not to 
say negative. The reason for this unfortunate circumstance is that these tools 
operate on the level of detailed geometrical description of the object. Being 
forced to supply exact co-ordinates, dimensions, etc., the designer is distracted 
from conceptual thinking. This is especially annoying in architectural design, 
where many conceptual sketches are thrown away before the final solution is found 
[1]. 

Bearing that in mind, we propose a graph-based representation of the domain 
knowledge in Civil Engineering that consists of two levels. At the upper level the 
designer describes an abstract structure of the considered object. At the lower level all 
details required for the visualisation are introduced. 

Graphs reflecting internal structure of the object can be generated either by the 
Composite Representation System (CRS) [2], developed at the Jagiellonian 
University in Cracow, or by the PROgrammed Graph REwriting System 
(PROGRES) [3], developed at the RWTH Aachen. 

Visualisation is performed by ArchiCAD, one of the CAD tools commonly used by 
architects. A special converter has been developed that automatically generates an 
input file for ArchiCAD. 



M. Nagl, A. Schürr, and M. Münch (Eds.): AGTIVE'99, LNCS 1779, pp. 241-246, 2000. 
© Springer-Verlag Berlin Heidelberg 2000 




Janusz Szuba et al. 



2 Graph-Based Knowledge Representation 

Let us outline the basic principles of our approach. We restrict our consideration 
to a rather simple example of designing a kettle, since the methodology remains 
valid for any object. First we encourage the designer to create a graph describing 
the general structure of an object (Fig. 1). Nodes of the graph represent 
components of the considered object. Edges represent various relations between 
components, such as adjacency. Such a graph can also be generated by a graph 
grammar or graph rewriting system. 

[Figure: structure graph of a kettle with nodes such as handle and container. Nodes 
represent components of the considered object; edges represent relations between 
components, such as adjacency.] 

Fig. 1. Graph representing general structure of a kettle 

After the structure of the object has been defined, one has to consider its geometrical 
properties. This is accomplished by means of the so-called realisation scheme. Such a 
scheme incorporates geometric primitives, their basic transformations, as well as 
constraints imposed on the design. Below we define a particular realisation scheme 
for the graph from Fig. 1. 

2.1 Realisation Scheme: 

• geometric primitives assigned to nodes 
  [figure: the primitives, drawn in a coordinate system with a z axis] 

• transformations of primitives 
  primitive assigned to handle: translate(-0.8, 0, 1); 
  primitive assigned to spout: rotate(45); translate(0, 0, 1); 

• constraints imposed on the design 
  kettle has to carry 1.5 litres of water. 
The final outcome of the above realisation scheme is the drawing of the kettle shown 
at the top of Fig. 2. Note that all three designs of the kettle presented in that figure 
correspond to the same structural graph. They differ only in their realisation schemes. 
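
The separation between the structural graph and the realisation scheme can be sketched in Python as follows. The sketch places one anchor point per node by applying the transformations of a scheme in order. Several details are assumptions made only for illustration: the rotation is taken about the y axis, the second scheme (scheme_b) is invented, the container is treated as a cylinder for the volume constraint, and lengths are measured in decimetres so that one cubic unit equals one litre.

```python
import math

def translate(dx, dy, dz):
    """Return a transformation that shifts a point by (dx, dy, dz)."""
    return lambda p: (p[0] + dx, p[1] + dy, p[2] + dz)

def rotate_y(degrees):
    """Rotation about the y axis (the axis is an assumption of this sketch)."""
    a = math.radians(degrees)
    return lambda p: (p[0] * math.cos(a) + p[2] * math.sin(a),
                      p[1],
                      -p[0] * math.sin(a) + p[2] * math.cos(a))

def realise(structure_nodes, scheme, anchor=(0.0, 0.0, 0.0)):
    """Place one anchor point per node by applying its transformations in order."""
    placed = {}
    for node in structure_nodes:
        p = anchor
        for t in scheme.get(node, []):
            p = t(p)
        placed[node] = p
    return placed

def cylinder_litres(radius_dm, height_dm):
    """Volume of a cylinder in litres, with lengths given in decimetres."""
    return math.pi * radius_dm ** 2 * height_dm

# One structural graph (only its node set is needed here), two schemes.
nodes = ["container", "handle", "spout"]
scheme_a = {"handle": [translate(-0.8, 0, 1)],
            "spout": [rotate_y(45), translate(0, 0, 1)]}
scheme_b = {"handle": [translate(-1.0, 0, 0.5)],
            "spout": [rotate_y(30), translate(0.2, 0, 1)]}
layout_a = realise(nodes, scheme_a)
layout_b = realise(nodes, scheme_b)
holds_enough = cylinder_litres(0.7, 1.0) >= 1.5
```

Both layouts are derived from the same node set; only the scheme changes, which is the point made in the text about the three kettles of Fig. 2.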




Fig. 2. Various graph visualisations for the same graph 



3 Process of Design 

The scheme shown in Fig. 3 explains what the process of design looks like in our 
approach. We start with functional requirements. For example, if we design a house 
for a family then it should provide areas for having meals, sleeping, social 
activities, etc. Having defined functional requirements, we translate them into a 
structure of the object. Such a structure is generated by our system automatically 
because the system contains a generative component (e.g. graph grammar or rewriting 
system). The backward loop in Fig. 3 indicates that the designer can keep generating 
graphs of object structure until he finds a satisfactory solution. 

After that the designer chooses the realisation scheme and obtains the 
visualisation of the object. Here again an iterative loop is entered: if the 
designer is not satisfied with the current solution, he can modify the realisation 
scheme and obtain an alternative visualisation. Browsing through potential solutions 
in this way can inspire the user of our system toward innovative design. The 
transition from functional requirements to the object’s structure is described in 
detail in [6]. 
Fig. 3. Scheme of design process 

In this paper we concentrate on the passage from the object structure to the object 
visualisation. Prototype software that allows the user to define realisation schemes 
and integrates our environment with ArchiCAD has been developed. We describe 
it briefly in the next section. 



4 Object Visualisation in ArchiCAD 

ArchiCAD, developed by Graphisoft, is a commercially available tool for 
computer-assisted design in Architecture and Civil Engineering. This package uses 
the Geometric Description Language (GDL) as a format for describing artefacts. 
Files written in that language can be imported into ArchiCAD, which converts them 
into drawings. 

In order to facilitate integration of our prototype tools with ArchiCAD, a special 
transition module was built. This module includes an editor allowing the user to 
define and to modify realisation schemes for a given graph representing the 
structure of the designed object. Such a graph is the input to the converter. The 
output is a GDL-file that can be passed to ArchiCAD. 

Nodes of the input graph represent rooms of the house and edges represent 
accessibility conditions. For example, an edge between the nodes antechamber and 
hall 1 in Fig. 4 means that these two rooms share a wall. Doors and windows are 
described as subelements of walls. 
[Figure: structure graph with nodes including shower, terrace, room 1, room 3, 
bath, kitchen, and antechamber, shown together with its visualisation.] 

Fig. 4. Structure graph of a single-story family house and its visualisation 



When we choose a realisation scheme, we must define geometric characteristics for 
every node of the structure graph. A generic room is characterised by the area of 
its floor and by its length-to-width ratio. Its local reference frame is based upon 
the orthogonal directions NS (North-South) and WE (West-East). The program tries to 
fit the room into the given contour of the house by translating and rotating this 
reference system. At present, geometric conflicts are resolved manually by the user. 
It is planned to implement a constraint propagation technique in the future. 
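
The two room parameters determine the side lengths of a rectangular room. As a small worked example, assuming a rectangular floor (an assumption of this sketch; the function name is hypothetical), the width follows from area = length * width and ratio = length / width:

```python
import math

def room_dimensions(area, ratio):
    """Return (length, width) with length * width == area and length / width == ratio."""
    if area <= 0 or ratio <= 0:
        raise ValueError("area and ratio must be positive")
    width = math.sqrt(area / ratio)
    length = ratio * width
    return length, width

# A room of 24 square metres with a length-to-width ratio of 3:2.
length, width = room_dimensions(24.0, 1.5)
```

For this input the room measures 6 m by 4 m; placing the resulting rectangle inside the contour of the house is the separate fitting step described above.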




After the realisation scheme has been defined, the program generates a GDL file
that contains a 3D description of the house. Figs. 4 and 5 present two examples of using
the Graph-GDL Converter. The first one represents a single-family house and the second
one a hostel. Note that by applying simple and intuitive operations to graphs the user
immediately obtains alternative floor layouts. For example, one could try merging the
halls in Fig. 4 or establishing direct accessibility of the manager’s room from the
antechamber in Fig. 5. Moreover, typical layouts can be generated by properly
constructed graph grammars as shown in [6]. Thus, the proposed tool enhances the
sketching of alternative layouts by hand by providing the architect with an immediate
3D visualisation of the object.
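
The graph operations mentioned above can be illustrated with a small sketch of a structure graph whose nodes are rooms and whose edges are accessibility relations. The class `StructureGraph`, the node name "hall 2", and the edges used in the example are hypothetical; only the idea of merging two halls comes from the text.

```python
from collections import defaultdict

class StructureGraph:
    """Undirected graph: nodes are rooms, edges are accessibility relations."""

    def __init__(self):
        self.adj = defaultdict(set)

    def add_edge(self, a, b):
        self.adj[a].add(b)
        self.adj[b].add(a)

    def merge(self, keep, drop):
        """Merge room `drop` into room `keep`, redirecting its edges."""
        for neighbour in self.adj.pop(drop):
            self.adj[neighbour].discard(drop)
            if neighbour != keep:
                self.add_edge(keep, neighbour)

g = StructureGraph()
g.add_edge("antechamber", "hall 1")
g.add_edge("hall 1", "hall 2")    # "hall 2" is an assumed node name
g.add_edge("hall 2", "kitchen")   # assumed edge, for illustration only
g.merge("hall 1", "hall 2")       # the merged hall now gives access to the kitchen
```

Each such modification yields a new structure graph that could be passed to the converter again.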




[Figure: graph of the hostel; legible node labels include manager’s room, WS, shower, antechamber, kitchen, and room 1 to room 6]

Fig. 5. The graph of a hostel and its visualisation

5 Composite Representation System and PROGRES

In our research we used two generative systems: the Composite Representation System
(CRS) and the PROgrammed Graph REwriting System (PROGRES). The CRS software
allows the designer to build a graph grammar with a control diagram [2]. The latter
determines the order in which productions are applied. The structure graph, the grammar
and the control diagram constitute a complete set of tools for generating design solutions.
Alternatively, one can use PROGRES [3], a well-known graph rewriting system.
Both systems were applied to graph generation according to the methodology
described in Section 2.

6 Summary 

This paper shows that generative systems can be used efficiently in designing
buildings. The prototype software demonstrates that it is possible to augment
ArchiCAD with a graph-based tool that allows the designer to generate alternative
solutions and to evaluate them.

The proposed scheme inspires creative thinking. The user is relieved from 
considering geometrical details at the conceptual phase of design. After the principal 
structure of the building is coded in the graph, numerous alternatives can be 
investigated by merely modifying the realisation scheme. A preliminary evaluation of
the tool by architects was positive. However, much remains to be done before the
proposed representation scheme is ready for commercial applications.

Abstract. In this overview paper, a specification method based on coupled
graph grammars is sketched that uniformly describes the legal domain
configurations on which distributed modeling as well as re-engineering
tasks are based. The specification method supports the development of
proper (re)design tools, and the method is itself tool-supported.



1 Introduction 

In this section, two well-known approaches to distributed modeling and to
re-engineering of legacy systems are roughly sketched as a reminder for the reader.
This should help in understanding some common characteristics of
both, which can be uniformly represented in an extended graph-grammatical
framework, the topic of this paper. (Following a “deductive path”,
the argumentation of this paper proceeds from some general experiences to the
specific contribution.)

1.1 Re-Engineering 

In her contributions [1][2][3], Cremer describes how legacy COBOL systems can
be transduced into an object-oriented architecture serving as a basis for
re-implementation in a modern programming language, or even for distribution
via CORBA. In a first step of analysis, Cremer employs the meta language
Txl for a selective parsing of the relevant structures of the legacy system. Then
she exports the output of the Txl procedure into the specification environment
PROGRES, which is based on a set-oriented algorithmic graph transformation
approach. In the PROGRES environment, a homomorphic representation of the
legacy system is restructured according to the paradigm of graph grammar
engineering described in [21].

In their contributions [13][14], Jahnke, Schäfer, and Zündorf describe the
transformation of relational database systems into object-oriented ones. Both
of their approaches are based on graph grammar engineering, too. In [13], both the
structures of the relational database system and those of the object-oriented database
system are modeled as graphs. Then, a non-deterministic graph transformation
system specifies the domain transition according to the principles of [21]. As the
tables of the relational database system may be translated into structures of the
object-oriented system in many different fashions, the re-engineering tool offers
many alternative transition operations from which the re-engineer has to choose.
In [14], the procedure has been developed further.



1.2 Distributed Modeling 

The purpose of distributed modeling is to reduce the design complexity by work-
sharing among several designers. The VIEWPOINTS approach of Finkelstein et
al. [6][7][8][9] is especially interesting because it allows consistency
checks (and inconsistency repair) to be delayed until explicitly demanded. Goedicke,
Taentzer et al. have shown that the original VIEWPOINTS specification based on first-
order logic can easily be substituted by a specification based on graph
transformation [5][10].

1.3 Generalization 

Both in re-engineering and in distributed modeling one is concerned with the
mutual consistency of two or more domains. From this point of view, the main
differences between those software engineering tasks are intentional differences
reflected by the dimension of time $t$: In re-engineering, some system $A$ (maybe
a structure, a program, a document, etc.) is living in domain $D_A$ from $t_{-x}$ until
$t_0$ while some corresponding system $B$ is living in domain $D_B$ (whereby not
necessarily $D_A \neq D_B$) from $t_{-y}$ until $t_z$ such that $A$ and $B$ are consistent with each
other at $t_0$ (with $t_{-x} < t_{-y} < t_0 < t_z$). In distributed modeling, all involved
systems $A_1, \ldots, A_n$ evolving in their domains $D_{A_1}, \ldots, D_{A_n}$ from $t_i$ to $t_j$ need
to be consistent (or, in weaker approaches: partially consistent) at certain times
$t_i < \ldots < t_x < t_z < \ldots < t_j$. Thus, by abstraction from those details, one may
look at models or domains as graphs, and, consequently, one may look at the
task of consistency management in re-engineering or in distributed modeling as
graph transformation. Legal domain configurations can be described as sentences
(or sentence forms) generated by special kinds of coupled graph grammars, as
further explained in the following sections.

2 Graph Grammar Specifications 

The following section sketches how different specification intentions may lead to 
different classes of specifications. Then, some related work is briefly discussed. 



2.1 Tool oriented View — Domain oriented View 

In many graph-grammatical approaches to the solution of practical software-
engineering problems, a directly tool-oriented position is taken. In tool-oriented
specification approaches, the employed graph grammars are viewed as transformative
entities that describe the operations of a tool $T$ on its objects within an
only implicitly given object domain $D_T$. Assuming that an object $O$ is a member
of $D_T$, the tool-oriented specification ensures that a $T$-transformed object $O'$
(thus: $O \rightarrow O'$) is a member of $D_T$ as well. On the other hand, in a domain-
oriented specification approach the transformative behavior of a tool $T$ is not
regarded with first priority. Hence, in such an approach the employed graph grammars
are viewed as generative entities that enumerate the relevant domains
$D_T$ themselves.¹ As a consequence, it is possible to ask whether a given object $O$ is a
member of $D_T$.²



2.2 Related Work 

The already mentioned contributions [1][2][13][14] follow the paradigm of graph
grammar engineering presented in a paper by Schürr, Winter, and Zündorf [21].
According to this paradigm, one graph schema is used as a kind of hierarchic type
declaration for one graph grammar (or, more generally, one graph transformation
system). Therefore, it seems quite difficult to meta-model distributed
modeling (dealing with many domains) within the paradigm of [21]. (Consequently,
a generalization of this paradigm will be sketched below, wherein many graph
schemas and many graph grammars can be involved.)

A VIEWPOINTS-oriented approach has been taken by the authors of [5] (also
mentioned above already). Unlike the approach of [21], they do not employ a 
sophisticated graph schema for the purpose of typing, but they explicitly mention 
the different views of a distributed modeling environment. In order to represent 
those within a graph-grammatical formalism, the authors introduce the notion
of partial grammars (or sub-grammars) being parts of a whole graph grammar. 
While the sub-grammars characterize the evolution of the particular domains in 
distributed modeling, the whole graph grammar represents the possible history 
of the total modeling environment. (Thus, the question may arise now if it is 
possible to have both the advantages of typing with graph schemas and the 
explicit recognition of more than one domain.) 

As already suggested more than twenty years ago by Pratt [18][19], it is
possible to glue two or more graph grammars together in a parallel fashion.
Doing so, one can describe the construction of consistent configurations of sub-
models in a distributed environment quite intuitively. Given $n$ graph grammars
with their graph languages $\mathcal{L}_1, \ldots, \mathcal{L}_n$ representing the particular sub-model
spaces, and given a proper coupling specification $\Theta$, the language
$\mathcal{L}_\Theta \subseteq \mathcal{L}_1 \times \ldots \times \mathcal{L}_n$
represents the space of all consistent model configurations. (This
approach has been followed by several authors [15], and a generalization of it is
sketched below.)
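
The set-theoretic reading of the coupled language as a subset of a Cartesian product can be made concrete with a toy sketch. It assumes finite languages and a coupling given as a plain predicate; real graph languages are in general infinite, and the coupling in this paper is declared on schemas and rules rather than on finished models. The names `coupled_language` and `same_size` are hypothetical.

```python
from itertools import product

def coupled_language(languages, is_consistent):
    """Return all tuples from L_1 x ... x L_n accepted by the coupling predicate."""
    return [cfg for cfg in product(*languages) if is_consistent(cfg)]

# Toy stand-ins: each sub-model is abbreviated to its number of elements
structure_models = [1, 2, 3]   # e.g. number of pipes in a structure model
content_models = [1, 2]        # e.g. number of flows in a contents model

def same_size(cfg):
    """Assumed coupling: every pipe corresponds to exactly one flow."""
    return cfg[0] == cfg[1]

consistent = coupled_language([structure_models, content_models], same_size)
```

Only the configurations accepted by the predicate remain; all other combinations of sub-models are excluded from the consistent space.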



¹ Both approaches can be combined for best specification results.

² The membership problem of grammars is undecidable in general, of course, but
fortunately there is a large class of useful graph grammars whose membership problem
is decidable, as proven by Rekers and Schürr in [20].

3 Coupled Graph Schemas - Coupled Graph Grammars

In [11] it is described how several software development environments (which are
comprehensively explained in [15]) have been specified with coupled graph
grammars according to the ideas of [18][19]. The aim of this section is to argue that
the useful design technique of graph grammar coupling can be further improved
by applying a corresponding design technique of coupled graph schemas. After
the concepts are sketched in general, a small example is given below.



3.1 Concepts 

In his dissertation [12] (embedded in the research context of [17]), the author 
describes how coupled graph grammars can be used to specify the integration of 
different yet mutually dependent sub-models in a CAD environment for chemical 
engineering. Thereby, the coupling of graph grammars is guided by the coupling 
of graph schemas representing some structural constraints on the visual design
language generated by the coupled graph grammars. Formally defined in [12], the 
specification approach sketched in this section generalizes the graph grammar 
engineering approach of [21] in two directions: First, the concept of hierarchic 
typing and super-typing of graph nodes has been transferred to the edges as 
well. Second, the relation between one graph schema and one graph grammar 
has been extended to a relation between n graph schemas and n graph grammars 
for the sake of inter-domain consistency descriptions. 



[Figure: super type (x1) above the ground types PIPE and TANK; super type (x2) above the ground types LIQ, inUse, and FLOW]

Fig. 1. Coupled graph schemas (n = 2)



In Fig. 1, a coupling of two graph schemas $S_S$ and $S_C$ is shown. Arrows
denote that the ground types PIPE and TANK are super-typed by some
type $(x_1)$ in $S_S$, whereas the ground types LIQ, inUse, and FLOW are super-
typed by some type $(x_2)$ in $S_C$. The thick solid lines declare that the concepts PIPE
and FLOW as well as TANK and inUse correspond to each other. However, any
correspondences between $S_S$ and the LIQ concept are declared as forbidden, which
is denoted by the broken line between LIQ and the super type $x_1$. (As in the
PROGRES system [21], the ground types can be instantiated while the super
types cannot: they only serve as abstract property descriptors.)
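
The schema coupling of Fig. 1 can be written down as plain data, as in the following sketch. The dictionary and set names as well as the function `may_correspond` are hypothetical; the sketch further assumes that a prohibition declared against a super type covers all ground types below it.

```python
# Super-typing as described for Fig. 1 (ground type -> super type)
SUPER = {"PIPE": "x1", "TANK": "x1", "LIQ": "x2", "inUse": "x2", "FLOW": "x2"}

# Thick solid lines: positive correspondences
ALLOWED = {("PIPE", "FLOW"), ("TANK", "inUse")}

# Broken line: no correspondence between anything below x1 and LIQ
FORBIDDEN = {("x1", "LIQ")}

def may_correspond(s_type, c_type):
    """True if the pair is declared as corresponding and not forbidden.
    A prohibition on a super type is assumed to cover its ground types."""
    s_candidates = {s_type, SUPER.get(s_type)}
    c_candidates = {c_type, SUPER.get(c_type)}
    if any((s, c) in FORBIDDEN for s in s_candidates for c in c_candidates):
        return False
    return (s_type, c_type) in ALLOWED
```

With this encoding, `may_correspond("PIPE", "FLOW")` holds, whereas `may_correspond("PIPE", "LIQ")` is rejected through the prohibition declared on the super type.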

In a similar way, the necessary correspondences between the rules of different
graph grammars, which generate the different domains of some distributed modeling
task, can be declared. It is worth mentioning that the type-correctness of
such graph grammar couplings can be checked via the axiomatic graph schema
correspondences sketched in the figure above.

3.2 Example 

Imagine an engineering task in which a system of tanks and pipelines
shall be modelled in correspondence with, but not immediately together with,
some liquid materials flowing through the system. A graph grammar S shall
describe the possible structure of such a system whereas a graph grammar C
shall describe some properties of the liquid contents of the system.

Fig. 2. Rules of graph grammar S



Fig. 2 shows two rules of S. With S.r1, new tanks can be added to the system.
With S.r2, new pipelines can be added accordingly. (The several possible
graph transformation semantics are disregarded at this point of the discourse.)
On the other hand, let us assume that new liquid materials shall be introduced
and their applications shall be handled. As depicted in Fig. 3, this is done by the
rules C.r1, C.r2, and C.r3 of graph grammar C.



Fig. 3. Rules of graph grammar C



Suppose now that a coupling $\Theta \subseteq S_S \times S_C$ of graph schemas is given (Fig. 1)
such that {PIPE - FLOW} and {TANK - inUse} are declared as the only positive
corresponding concepts of this example. Then it is obvious that applications of rule
C.r1 must be completely independent of the other model domain described
by graph grammar S. Moreover, it can be concluded automatically that no rule









252 Stefan Gruner 



correspondences {S.r1 - C.r3} or {S.r2 - C.r2} may be declared. Instead, a tool
may suggest that its user establish the rule couplings {S.r1 - C.r2} and {S.r2 -
C.r3} in order to characterize the integration of both involved model domains
$D_S$ and $D_C$.
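
The automatic conclusion described in this example can be sketched as a simple check of rule pairs against the schema correspondences. The mapping `RULE_TYPES` is an assumption: S.r1 and S.r2 follow the description of Fig. 2, while the types assigned to C.r2 and C.r3 are inferred from the couplings named in the text, not read from Fig. 3. The function name is hypothetical as well.

```python
# Positive schema correspondences of the example
CORRESPONDING = {("PIPE", "FLOW"), ("TANK", "inUse")}

# Ground types introduced by each rule (partly assumed, see lead-in)
RULE_TYPES = {
    "S.r1": {"TANK"},
    "S.r2": {"PIPE"},
    "C.r1": {"LIQ"},
    "C.r2": {"inUse"},   # assumed
    "C.r3": {"FLOW"},    # assumed
}

def admissible_couplings(s_rules, c_rules):
    """Return the rule pairs whose introduced types correspond in the schema coupling."""
    result = []
    for s in s_rules:
        for c in c_rules:
            if any((ts, tc) in CORRESPONDING
                   for ts in RULE_TYPES[s] for tc in RULE_TYPES[c]):
                result.append((s, c))
    return result

suggestions = admissible_couplings(["S.r1", "S.r2"], ["C.r1", "C.r2", "C.r3"])
```

Under these assumptions the check suggests exactly the two couplings mentioned above, and C.r1 remains uncoupled.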

When a rule coupling is declared as a whole (by rule names), the necessary
detailed couplings between the nodes and edges of the involved rule bodies have
to be declared accordingly by use of the information contained in $\Theta$. Due to lack
of space in this overview paper, the reader is referred to [11][12] for
the details.



4 Results 

The combined specification method of coupled graph schemas and coupled graph
grammars is formally sound such that it can be implemented by a specification
tool. The tool can support the domain specification tasks in the requirements
engineering phase of a distributed project. Given such an integrated domain
specification, further tools can be built to support the constructive design within
the project domains defined by a coupled graph schema and graph grammar
specification.

4.1 Tool Support for Integrated Domain Specifications 

In [12] a prototype is reported which supports the proper declaration of coupled
graph schemas and coupled graph grammars as sketched in the previous
section. With that prototype, the user can construct schema correspondences
and grammar correspondences in a syntax-directed fashion.* The tool is able
to check the consistency of the grammar correspondences with respect to the
given schema correspondences. It could be further enhanced by an interactive
suggestion component which provides the user with hints on possible couplings
of rules. The coupling of graph schemas and graph grammars is not restricted to
two dimensions. Instead, an $n$-ary approach ($\Theta \subseteq S_1 \times \ldots \times S_n$) is supported.



4.2 Tool Support for Distributed Modeling 

In the research project reported in [17], the possibilities of tool support for
distributed modeling in chemical engineering are studied. Important problems
occurring in this field seem to be quite similar to the tasks mentioned in the
introductory section above and are, therefore, open to graph-grammatical solutions
[4]. In the context of [17], another small prototype for experimental purposes
is described in [12]. That prototype implements a partial graph-grammatical
parse-and-generate approach to consistency of certain chemical-engineering
domains. With that tool, the user can construct simple structure views of chemical
plants, which can be partially translated into corresponding contents views
afterwards; thus, the functionality of that tool is inspired by certain aspects
of the already mentioned VIEWPOINTS paradigm. The parse-and-generate
approach results from the coupled specification method and has been transferred
into the PROGRES environment [21] from which the prototype tool has been
generated. The correspondence information residing in the underlying coupled
domain specification is kept by an intermediate parsing graph which occurs as
the tool proceeds with its integrative operations from one domain to the other.

* The software system of the tool reuses some source code of the PROGRES
environment [21] according to the framework method reported in [15].



4.3 Conclusion 

• In this overview-paper it has been argued that coupling of graph grammars 
is able to serve as a sound and uniform method for describing and under- 
standing important document-processing tasks like software re-engineering 
or distributed modeling. 

• Provided with the notion of graph schemas, the coupling of graph grammars 
can be pre-specified by the coupling of graph schemas such that the relative 
consistency of the coupled grammar specification can be checked with respect 
to the coupled schema pre-specification. 

• As the method is formally sound, tool support for the construction of coupled
graph grammars, which can be applied in various domains and contexts of
industrially relevant document-design tasks, is possible.

Abstract. In the publishing process, readers, publishers and writers can
profit from structured documents. Adding explicit structural information
to a document is currently so costly that it is rarely done. We show an
application of graph technology which supports authors by offering a
possibility to model the concepts to be discussed and their relationships.
These semantical structures can then be serialized into multiple ordered
hierarchies which provide a framework for formulating the document 
content. Either of the structures can be edited with the other being 
kept consistent. Publishers can reduce their copy editing and cross-media 
publishing cost by using this information. 

We explain the implementation of these operations as an executable graph
production system, that is, the productions as well as the underlying graph
schemas.



1 Introduction 

In today’s publication process the author transforms the ideas he wishes to trans- 
port to the reader into linear text. The publisher then invests substantial manual 
labor to add semantical markup and bibliographical data to typeset and struc- 
ture the document. The structured document is then printed and distributed to 
the reader who has to infer the intentions of the author from the printed text. 

Additional structural information can enhance the information flow between 
author and reader and, if already created by the author, lower the costs for the 
publisher. To motivate the authors to provide this information, we leverage the 
model of the document’s semantical structure to help the author produce better 
documents faster. 

We suggest graph-based tools for modeling the intended semantical structure 
of a text during document construction. The author uses these tools to create 
the semantical model stored as a graph. 

This work has been funded by the Deutsche Forschungsgemeinschaft (DFG) in
its “Schwerpunktprogramm V3D2” (Verteilte Verarbeitung und Vermittlung
digitaler Dokumente, Distributed Processing and Exchange of Digital Documents),

http://www.cg.cs.tu-bs.de/dfgspp.VVVDD.

The research described here is part of an industrial cooperation with Springer-Verlag
dealing with parameterized production of multimedia books.



M. Nagl, A. Schürr, and M. Münch (Eds.): AGTIVE’99, LNCS 1779, pp. 255-262, 2000.
(c) Springer-Verlag Berlin Heidelberg 2000

2 Authoring Support

A first step in the authoring process before writing is to determine which informa- 
tion should be transported. It is difficult for the author to give clear boundaries 
of the thematic space addressed in the document being created. Given a main 
focus the author must decide, guided by the addressed readers, what additional 
subjects need to be described in the document. By assuming some given
knowledge, he designs introductory sections that guide the reader from his context to
the document’s main context. Additional topics present benefits of the author’s
point of view.

While writing, the author realizes additional subjects and themes that must 
be incorporated in the document. He may even need to change the overall struc- 
ture, resulting in restructuring, additional proofreading and often inconsistent 
documents. 

For these decisions, an overview of the semantical network constituted
by the potential topics is useful. It is, however, often far too complex to be 
kept in mind. If multiple authors write one document they need to interchange 
their semantical networks. Therefore, a notation to write down this network is 
needed. As a description of general cognitive structures, this notation cannot be 
very restrictive. We model it as terms and relations between these terms forming 
some arbitrary graph which we call idea net. 

The next step is the conversion of that idea net into an ordered hierarchi- 
cal form presentable as text. Today’s texts are organized as hierarchies (parts, 
chapters) with a given in-order walkthrough (chapter 1 headline, text for chap- 
ter 1, section 1.1 etc.). We call this mapping from free graph form to the more 
restricted ordered hierarchy serialization. 

When doing the serialization, a thread of argumentation is created for the 
reader. This thread will follow a path through the idea net, converting relations 
to arguments. Since ideas and threads are separated, the author may explore 
and compare differing serializations all linked to the same semantical ground. 

The serialization is finally linked to an external document of an authoring 
system the author is accustomed to. This way, he doesn’t have to accept a new 
application for the old tasks. 




Fig. 1. Overview of the conceptual components 



These views (illustrated in fig. 1) on the document and its semantics must be 
kept consistent to be useful, without the user having to take care of it. Changes 
in the idea net are propagated to the serialization created from that idea net. 



Improving the Publication Chain through High-Level Authoring Support 257 




Fig. 2. Idea net drawn in the creation of this paper. 



Edits in the serialization are undertaken in the external document, too. This also 
works the other way round: If the author deletes a paragraph in the external 
document, that change can affect the idea net. 

We call this collection of features High Level Authoring Support. We have
implemented a prototype semantical back-end to create and edit an idea net
and an integrated serialization using PROGRES [10]. Graphs are the natural data
structure for interconnected networks, and using graph productions to manipulate
them allows us to quickly create and change commands and to experiment
with the environment in this early stage of development. We extended the ToolBook
(www.asymetrix.com) authoring environment to send change messages for
(up to) all user actions [8] to the prototype. The UPGRADE framework [9] for
generated prototypes is being extended to accept commands not only from the
user interface, but also through inter-process communication.



2.1 Idea Net Support 

Our prototype supports a model of terms and their relations stored as a graph. A 
snapshot of the idea net we drew before we wrote this paper is shown in figure 2. 
The figure shows a screenshot of the available prototype. The bright boxes are 
Terms, the gray ones relationships between them. The numbers can be ignored
here. 

Figure 3 shows part of the PROGRES schema defining the content model
declared in the abovementioned specification. It uses a UML notation with node 
classes and sections mapped to rectangles and folders; edge types to associations. 



[Figure: Content Scheme with node classes SEMANTICAL, STATEMENT, TERM, CLUSTER, and REQUIREMENT and edge types Subject, Object, and Gathers]


Fig. 3. Core of the schema of the idea net content model. 



Every class in this schema denotes some semantical concept, so there is a common
abstract root class Semantical. The basic semantical unit is a Term. Clusters
collect semantical entities via the Gathers relation so that they can be addressed
together. No further semantics are defined for this part-of relation.

Statements express relations between semantical units as Subject-Predicate- 
Object constructs. A Requirement is a special Statement: the author claims that 
understanding of the subject requires knowledge about the object. Any serial- 
ization should therefore place the object of a requirement before the subject. 
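
For readers who prefer code to schema diagrams, the content model described above can be approximated by a small class hierarchy. This is only an illustrative sketch in Python, not the PROGRES schema itself; the attribute names `gathers`, `subjects`, and `objects` mirror the edge types, and the cardinalities of the schema are not enforced.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(eq=False)
class Semantical:
    """Abstract root of all semantical concepts."""
    name: str

@dataclass(eq=False)
class Term(Semantical):
    """Basic semantical unit."""

@dataclass(eq=False)
class Cluster(Semantical):
    """Groups semantical entities so they can be addressed together."""
    gathers: List[Semantical] = field(default_factory=list)

@dataclass(eq=False)
class Statement(Semantical):
    """Subject-Predicate-Object construct; the name acts as the predicate."""
    subjects: List[Semantical] = field(default_factory=list)
    objects: List[Semantical] = field(default_factory=list)

@dataclass(eq=False)
class Requirement(Statement):
    """Understanding the subjects requires knowledge about the objects."""

ipsen = Term("IPSEN")
integration = Term("Integration")
needs = Requirement("requires", subjects=[ipsen], objects=[integration])
```

Because a Requirement is a Statement, any operation defined for statements, such as serialization, applies to requirements as well.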

Tool support is limited to the syntactical level. Basic editing commands exist 
to handle Terms and Statements and to group them in Clusters. Future work may
produce operations to automatically propose clustering or to collapse and expand 
clusters. 

The simple notation is intended to help the author to organize his thoughts 
and to interchange them with other people. Computer support in this early stage 
provides basic benefits like easy editing and storing. It is also the basis for further 
authoring support. 



2.2 Serialization 

The core of the graph schema of the serialization is shown in figure 4. It im- 
plements a simple ordered hierarchy that suffices to implement the envisaged 
serialization support. Serialized entities can either be Atomic or complex Divi- 
sions. The top-level Division is marked as a Work. The content model and the 
serialization model are connected by Discusses edges from Serialized to Semantical 
nodes. 

Graph productions create and maintain serialized forms of idea net struc- 
tures. These productions also manage the Discusses interconnections. 

Fig. 4. Core of the schema of the serialization part.



One basic rule for the serialization of a statement is to introduce the object 
before the subject. To describe IPSEN (cf. the left of figure 2) and its benefits, the 
advantages of “Integration”, “Generation”, and “Checks” should be established 
first. However, whether the statement should be described before or after the 
description of its parts depends on the subject. We distinguish these alternatives 
as inductive (parts first) and deductive (association first). 

We discuss the initial inductive serialization of a statement as an example.
Subjects and objects of that statement are serialized in arbitrary order.¹ The
starting nodes of these sequences are passed to the production
Serialize_InductiveCore shown in figure 5. It ensures that the discussion of objects precedes
the discussion of subjects and, more importantly, appends the discussion of the
statement to the subject sequence.
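
The ordering rules for a single statement can be summarised in a few lines of Python. This is a simplified sketch on plain lists, not a transcription of the graph production; the function name `serialize_statement` and the placement of the parts in the deductive case are assumptions.

```python
def serialize_statement(statement, subjects, objects, inductive=True):
    """Return a reading order for one statement and its parts.

    Objects always precede subjects. In the inductive variant the statement
    follows its parts; in the deductive variant it is presented first.
    """
    parts = list(objects) + list(subjects)
    if inductive:
        return parts + [statement]
    return [statement] + parts

inductive_order = serialize_statement(
    "IPSEN benefits", ["IPSEN"], ["Integration", "Generation", "Checks"])
deductive_order = serialize_statement(
    "IPSEN benefits", ["IPSEN"], ["Integration", "Generation", "Checks"],
    inductive=False)
```

Switching between the two variants only moves the statement itself, which is why such a transformation can be offered as a single high-level command.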

This serialization gives a first impression of how an ordering of the semantical
terms will read. The simple rules described above will not suffice to serialize
complex idea nets; others have to be found and implemented. Furthermore, when
writing text for some serialization nodes, new relations will come to the author’s
mind. He integrates these thoughts into the idea net or the serialization of the
concepts, with the tools keeping the other documents up-to-date.

The author therefore can directly edit the serialization as in conventional 
tools. Using high-level commands, he can also transform inductive to deductive 
expositions with little effort. 

2.3 Dependency Analysis 

Once an idea network and some serialization are set up, they are available for 
evaluation, such as automated quality assurance. In our prototype, tests are 
implemented to warn the author if he violates certain rules, creating conflict 
markers for the violations. One example is broken link surveillance: For two ideas 
handled in successive serialization units, an “inferred dependency” is recorded. 
If the adjacency is broken up, the dependency is flagged to remind the user of 
potential implicit references. 
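
Broken link surveillance as described here amounts to comparing adjacency before and after an edit. The sketch below works on plain lists of idea names and treats a dependency as an ordered pair; both function names are hypothetical, and the prototype realises this check with graph tests rather than Python.

```python
def inferred_dependencies(order):
    """Record an inferred dependency for each pair of ideas in successive units."""
    return set(zip(order, order[1:]))

def broken_links(recorded, new_order):
    """Return the recorded dependencies whose ideas are no longer adjacent."""
    return recorded - inferred_dependencies(new_order)

recorded = inferred_dependencies(["A", "B", "C"])
# Inserting a new unit "D" between "B" and "C" breaks one adjacency
flagged = broken_links(recorded, ["A", "B", "D", "C"])
```

Each flagged pair would be turned into a conflict marker so that the author can check the text for implicit references such as "as mentioned above".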

The author benefits from these markers in two ways. First he can work 
through these markers and decide for each marker whether that rule violation 

¹ Currently, other statements about these terms do not influence this serialization.

safe production Serialize_InductiveCore ( statement : STATEMENT ;
                                          firstObjectSerialized,
                                          firstSubjectSerialized : SERIALIZED [1:1] ;
                                          out newTopLevelDivision : DIVISION ) =

  [graphical left-hand and right-hand sides of the production]

  folding { '6, '7 } ; { '4, '5 } ;
  return newTopLevelDivision := 8'
end ;



Fig. 5. Graph production to create a first serialization of a statement. 



should be ignored or he can fix it by reordering conflicting serialization nodes. If 
the idea net contains dependency cycles, and most idea nets for nontrivial sub- 
jects do, conflicts with the inherent rules are unavoidable. The warning makes 
sure the author realizes the conflict and accepts its consequences. 
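
Why dependency cycles make conflicts unavoidable can be seen from a small sketch that tries to order ideas so that every required object precedes its subject. It uses the topological sorter of the Python standard library (Python 3.9 or later) as a stand-in for the graph tests of the prototype; the function name `propose_order` and the dictionary encoding of requirements are assumptions.

```python
from graphlib import TopologicalSorter, CycleError

def propose_order(requirements):
    """Map each subject to the set of objects it requires.

    Returns (order, None) if all requirements can be honoured, or
    (None, cycle) if the requirements contain a dependency cycle.
    """
    try:
        return list(TopologicalSorter(requirements).static_order()), None
    except CycleError as err:
        return None, err.args[1]

order, cycle = propose_order({"IPSEN": {"Integration"}})
no_order, found_cycle = propose_order({"A": {"B"}, "B": {"A"}})
```

In the cyclic case no conflict-free order exists, so the best a tool can do is report the cycle and let the author decide which requirement to violate.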

The second benefit is interactive support for alternative text serializations.
The author can simply derive a different order and is informed about violated
dependencies. He will choose an order that is a compromise between the bottom-up
approach supported by the graph transformation rules and an order determined
by the targeted audience and the grouping of related terms.



2.4 Integration with Other Applications 

While the semantical graph can help an author to structure his work, he still 
needs to write the text itself, for which he’ll prefer his customary tools. Thus, the 
semantical graph editor integrates with conventional authoring environments. 

Today’s integration technology allows for a-posteriori integration of such
tools. We use ToolBook’s open messaging mechanism to listen to editing ac- 
tions as well as to enact changes from the serialization. Some actions from the 
semantical back-end may also reasonably be invoked from the authoring appli- 
cation through additional commands or menu entries. 

3 Related Work

Our idea net resembles the storage layer of the Dexter Hypertext Reference Model [5,4],
which is separated from the run-time and within-component layers. Our
serialization structure can be seen as a further connection layer between semantics
and presentation. It defines how components will be ordered, but not what they 
will look like. Another valid perspective sees the graph structure document as 
an additional link layer which links nodes stored in the external authoring ap- 
plication’s document, which is then the storage layer. The first perspective does 
not satisfactorily explain the storage of the content, the second disregards the 
added presentation information in the external document. 

Another way to model hypermedia systems is the “Object Oriented Hyper- 
media Design Model” (OOHDM) [11]. It uses software engineering technologies 
to model the multimedia system. This leads to an implementation-oriented view,
while our approach emphasizes the contents.

The idea of letting the author edit the conceptual structure of the topic 
area and derive expositions from it has been examined to varying extents in a 
number of projects. [3] mentions the UNO Write Environment [12], which also 
interconnects a graphical and a hierarchical view. It does not address high-level 
support or integrating with existing authoring environments. 

Based on the UNO WE further research at the GMD produced the Hyper- 
StorM hypermedia engine [2]. Users work on private and shared documents to 
cooperatively develop documents or other products. The database provides for 
hierarchical objects and allows run-time schema modifications. 

The most important differences are: (1) We use the programmed graph
rewriting language PROGRES to implement the operations on the semantical
structure; PROGRES offers more descriptive power than the HyperStorM engine.
(2) The constraints given in HyperStorM remain on the database level. They 
cannot be used to describe aspects like dependency analysis, which are express- 
ible in PROGRES. 

The tools described in section 2 form an integration tool that mediates be- 
tween idea net and serialization graph. Within the IPSEN project, different types 
of integration tools have been studied [7, sections 1.6.3, 3.4, 4.6]. In that
classification, the serialization support described in this document is a
transformation tool.

[6] discuss a number of properties that can be calculated from conceptual 
structures by conventional graph computation. 

4 Conclusion and Plans 

The paper has described some problems currently impeding the authoring, pub- 
lishing and reading chain. Solutions, especially for the problems of the author, 
have been presented. Modeling the underlying structure as a graph allows ab- 
straction from order and containment where it is not yet defined. This additional 
information can then be incorporated, changed and experimented with until a 
document with deep structure is ready for distribution. The structure is then 
available for cross-media publishing, adapting to readers, browsing, and querying.

While the graph transformations presented here are of a research prototype 
nature, they will serve as the basis for providing incentive features to an author 
guidelines product developed in cooperation with Springer-Verlag and other publishers.
This work will be part of the Global-Info project, which is funded by the
Federal Ministry of Education, Science, Research and Technology. 

Before the prototype presented here can be useful for many authors, the 
execution machinery has to be integrated with commonly used authoring envi- 
ronments. We will implement another plug-in for the integration with Microsoft 
Word as required by our project partner, Springer-Verlag.
Abstract. Different algorithms for learning from examples are described
by means of a set of graph rewrite rules. Starting from either a very general
or a very special rule set, which is modeled as a graph, two to three basic
rewrite rules are applied until a rule graph explaining all examples is
reached. The rewrite rules can also be used to model the corresponding
hypothesis space, as they describe partial relations between different rule
set graphs. The possible paths that algorithms can take through the
hypothesis space can be described as application sequences. This schema is
applied to general learning algorithms as well as to fuzzy rule learning
algorithms.



1 Introduction 

Building models from data has started to attract increasing attention, especially
in areas where a large amount of data is gathered automatically and manual
analysis is not feasible anymore. Applications where data is recorded online
without the possibility of continuous analysis also demand automatic
approaches. Examples include such diverse applications as the automatic moni- 
toring of patients in medicine (which requires an understanding of the underlying 
behavior), optimization of industrial processes, and also the extraction of expert 
knowledge from observations of their behavior. Techniques from diverse disci- 
plines have been developed or rediscovered recently, resulting in an increasing 
set of tools to automatically analyze data sets. Most of these tools, however, re- 
quire the user to have detailed knowledge about the tools’ underlying algorithms, 
to make full use of their potential. In order to offer the user the possibility to
explore the data, unrestricted by a specific tool’s limitations, it is necessary to
provide easy-to-use, quick ways to give the user first insights. In addition, the
extracted knowledge has to be presented to the user in an understandable manner,
enabling interaction and refinement of the focus of analysis.

Fig. 1. A linguistic variable temperature with three linguistic values (described
through fuzzy sets) cold, warm, and hot and the degrees of membership for a
certain temperature

Fig. 2. Rule graph for the rules given in Example 1.

Learning rules from examples is an often used approach to achieve this 
goal. Over the years, different rule learning algorithms have been developed,
e.g. [7, 9, 6, 5]. For fuzzy rules, training algorithms are described in [1,10,15,3].
Overviews for both fields are given in [2]. 



2 Fuzzy Rules 

Example 1. In this paper we will use an example based on fuzzy rules [16,17] 
to explain the basic concepts. First the data used will be explained as well as 
possible rules describing this data. Our example handles the question of when oil
should be bought depending on the current weather and the current oil-price.
Three linguistic variables are used to describe these conditions. The linguistic
variable temperature with its linguistic values cold, warm and hot is shown in
Fig. 1. Similarly the linguistic variable oil-price with its values cheap, medium
and high and buy-ranking with buy, accumulate, don’t buy are used. Rules making
use of these variables are the following:
R1: if temperature is (cold or warm) and oil-price is cheap
    then buy-ranking is buy

R2: if temperature is warm and oil-price is (cheap or medium)
    then buy-ranking is accumulate

R3: if oil-price is high then buy-ranking is don’t buy

R4: if temperature is hot then buy-ranking is don’t buy

Simple IF-THEN rules together with conjunctions and disjunctions are used.
These rules can be transformed to a rule graph as shown in Fig. 2. Each linguistic
variable used as input and its corresponding linguistic values, each linguistic
value describing the result, and each rule are modeled as nodes. In our running
example there are nodes for temperature, cold, warm, hot, oil-price, cheap,
medium, high, don’t-buy, accumulate, buy and nodes modeling the four rules R1,
R2, R3, R4. Edges connect rule nodes to their input and output nodes as well
as linguistic variables to their possible values. Other graph-based descriptions
of rules are e.g. (Fuzzy) Petri Nets [4,13], which also have the advantage that the
execution of rules can be modeled with rewrite rules. But for the sake of clarity
(and space) a “smaller” graphical representation was chosen here.
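
As a purely illustrative sketch (it is not part of the original paper), the rule
graph of Example 1 can be written down as a small Python data structure. The
encoding is an assumption: every rule node stores the set of value nodes with an
edge into it plus one output value, and a variable that a rule does not mention
is connected through all of its values. Fuzzy membership degrees are ignored, so
an example is reduced to the linguistic value it falls into. The names VARIABLES,
RULES, fires and classify are hypothetical.

```python
# Linguistic input variables and their value nodes.
VARIABLES = {
    "temperature": {"cold", "warm", "hot"},
    "oil-price": {"cheap", "medium", "high"},
}

# Rule nodes: (set of input values with an edge to the rule, output value).
# Assumption: a variable not mentioned in a rule contributes all its values.
RULES = {
    "R1": ({"cold", "warm", "cheap"}, "buy"),
    "R2": ({"warm", "cheap", "medium"}, "accumulate"),
    "R3": ({"cold", "warm", "hot", "high"}, "don't buy"),
    "R4": ({"hot", "cheap", "medium", "high"}, "don't buy"),
}


def fires(rule_inputs, example):
    """A rule covers an example if each variable's value has an edge to it."""
    return all(example[var] in rule_inputs for var in VARIABLES)


def classify(example):
    """Return the outputs of all rules that cover the example."""
    return {out for inputs, out in RULES.values() if fires(inputs, example)}


print(classify({"temperature": "cold", "oil-price": "cheap"}))  # {'buy'}
```

In this crisp reading an example may be covered by more than one rule; a warm
temperature with a cheap oil-price, for instance, is covered by both R1 and R2.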

In the following, different training algorithms generating rules from examples 
are described. As description language, graphs as shown in Fig. 2 are used together
with simple graph rewrite rules. The specific rewrite formalism used is not
described further, since different formalisms, as described in [14], can be used
without difficulty. The next sections will deal with training algorithms starting
bottom-up from a special graph (i.e. the most specific rules) and top-down from 
a very general graph (the most general rules). Based on rewrite rules the set of 
all possible rule graphs, the hypothesis space, can be described. 

3 Bottom Up Training 

Starting with a very special graph and generalizing it until it covers all given 
training examples is the basic idea for bottom up training. The most special rules 
for a given set of examples are the examples themselves. It is easy to transform an
example into a rule, and this rule then covers nothing other than its underlying
example. In [15] such an algorithm was introduced. 

Example 2. For our running example we take the data given in Fig. 3 (left). Four 
possible data points are shown. For the sake of simplicity no actual numbers are 
given here. It is only indicated into which linguistic value each number falls.
Transferring each of the examples into a special rule leads to a graph as
shown on the right hand side of Fig. 3. E.g. the data example cold - cheap - buy
leads to the following rule:

R: if temperature is cold and oil-price is cheap
   then buy-ranking is buy

In Fig. 3 this rule is marked R.
temperature   oil-price   buy-ranking
cold          cheap       buy
warm          medium      accumulate
warm          high        don’t buy
hot           medium      don’t buy

Fig. 3. Four data points for training (left) and the corresponding rule graph
(right)

Fig. 4. Generalizing (left) and merging (right) in rule graphs.



As this kind of example-based rule graph does not generalize at all, it is useful
to change its structure to respond also to other inputs while still classifying
the examples correctly. Two kinds of operations can be performed to change the
rule graph. First the input to a rule can be generalized. In this case a new edge
is inserted, pointing from the possible new input of a rule, i.e. a node representing
a linguistic value such as hot, to the node representing the rule itself. This rewriting
rule is shown in Fig. 4 (left side). A negative application condition ensuring that
there is at most one edge between nodes is omitted.

Example 3. To the graph in Fig. 3 this rule could be applied sixteen times (each
of the four rules lacks four of the six possible input values), leading
to a graph where each of the four rules has all possible inputs. As there are only
three output values there are two equal rules handling the output don’t buy.
These rules can be merged with the help of the rule shown in Fig. 4 on the right.
Merging two rules means that two identical rules are merged into one rule or
that one of the identical rules is deleted. This leads to the graph as shown in
Fig. 5, the most general possible graph, which gives
a positive output for all three output classes no matter what the input is. The
graph given in Fig. 2 also handles the given examples correctly. It can be reached
by applying the generalization rule once to the first and second rule node and
twice to the third and fourth rule node starting from the graph given in Fig. 3,
right. The merge rule is not applied at all.
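
The two bottom-up operations can be sketched in a few lines of Python. This is
an illustration under assumptions, not the formalism of the paper: a rule is
stored as a pair of an input value set and an output, generalize adds one edge
and refuses to add it twice (the omitted negative application condition), and
merge removes all duplicate rules in one pass, whereas the graph rule merges two
rules per application. All function names are hypothetical.

```python
ALL_VALUES = {"cold", "warm", "hot", "cheap", "medium", "high"}


def start_graph():
    """Rule graph of Fig. 3 (right): one special rule per training example."""
    return [
        [{"cold", "cheap"}, "buy"],
        [{"warm", "medium"}, "accumulate"],
        [{"warm", "high"}, "don't buy"],
        [{"hot", "medium"}, "don't buy"],
    ]


def generalize(graph, rule_index, value):
    """Insert an edge from a value node to a rule node (at most one edge)."""
    inputs = graph[rule_index][0]
    if value in inputs:  # negative application condition
        raise ValueError("edge already exists")
    inputs.add(value)


def merge(graph):
    """Keep one rule for every distinct (inputs, output) combination."""
    merged = []
    for rule in graph:
        if rule not in merged:
            merged.append(rule)
    return merged


# Six generalization steps lead from Fig. 3 to the rules of Fig. 2:
# one step each for rules 1 and 2, two steps each for rules 3 and 4.
g = start_graph()
for index, value in [(0, "warm"), (1, "cheap"), (2, "cold"),
                     (2, "hot"), (3, "cheap"), (3, "high")]:
    generalize(g, index, value)

# Generalizing every rule as far as possible and merging gives Fig. 5.
top = start_graph()
for index in range(len(top)):
    for value in sorted(ALL_VALUES - top[index][0]):
        generalize(top, index, value)
top = merge(top)
print(len(top))  # 3, one rule per output class
```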
Fig. 5. A Rule Graph containing one Rule for each Output Class

Fig. 6. Shrinking (left) and Committing (right) in Rule Graphs



4 Top Down Training 

Fig. 5 serves as starting point for algorithms like [3]. In contrast to the last
section the graph must now be specialized, since most general rules are used
which do not handle the examples correctly. In [3] two operations are suggested:
a shrink operation, which reduces the input of a rule by deleting an edge pointing
from a linguistic value to a rule node, and a commit operation, which inserts a very
general new rule, i.e. a rule that has all possible inputs. Both rules are given in
Fig. 6. Taking Fig. 5 and the examples given in Fig. 3 (left), one commit and ten
shrink applications lead to the rule graph in Fig. 2, which classifies the examples
correctly. Of course this is just one possible path leading to one possible rule
graph. Other paths and other rule graphs are possible.
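
The path from Fig. 5 to Fig. 2 can be replayed with a short Python sketch. The
encoding is an assumption made for illustration: a rule is a pair of an input
value set and an output, and a rule that does not constrain a variable keeps the
edges from all values of that variable. The names commit, shrink and target are
hypothetical.

```python
ALL_VALUES = frozenset({"cold", "warm", "hot", "cheap", "medium", "high"})


def commit(graph, output):
    """Insert a new most general rule with edges from every value node."""
    graph.append([set(ALL_VALUES), output])


def shrink(graph, rule_index, value):
    """Delete the edge from a value node to a rule node."""
    graph[rule_index][0].remove(value)  # raises KeyError if there is no edge


# Start graph of Fig. 5: one most general rule per output class.
graph = [[set(ALL_VALUES), out] for out in ("buy", "accumulate", "don't buy")]

commit(graph, "don't buy")  # a second rule for the class "don't buy"

# Input sets of the four rules of Example 1 (assumed encoding).
target = [
    {"cold", "warm", "cheap"},
    {"warm", "cheap", "medium"},
    {"cold", "warm", "hot", "high"},
    {"hot", "cheap", "medium", "high"},
]

shrink_count = 0
for index, wanted in enumerate(target):
    for value in sorted(graph[index][0] - wanted):
        shrink(graph, index, value)
        shrink_count += 1

print(shrink_count)  # 10
```

After the commit the graph has 24 edges from value nodes to rule nodes; the four
rules of Example 1 need 14 of them in this encoding, which accounts for the ten
shrink steps.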

5 Organizing Models 

With the graph shown in Fig. 5 being the most general rule graph having true 
as its only possible output, it is obvious that Fig. 3 does not contain the most 
specific graph. This most specific graph is shown in Fig. 7 containing no rule 
at all. Following [12] the different rule graphs can be organized in the so-called
hypothesis space between the top and the bottom element.

Fig. 7. The bottom element of the rule graphs.

Fig. 8. A rule is inserted taking exactly one input from each variable.

In between these two, all possible rule graphs can be organized as follows:

Starting from the bottom element, the empty graph, three operations are 
necessary to build up the complete set of rule graphs: 

- An insert-special rule as shown in Fig. 8, which inserts the most special rule
  into the rule graph, i.e. a rule having one input from each variable, e.g.
  temperature or oil-price.

- A generalize rule as shown in Fig. 9, which does nothing other than extend the
  input of a rule by inserting an edge from a value node such as hot, medium, or
  cold to a node representing a rule. Of course it must be ensured that such
  an edge does not exist already.

- A delete-general rule as shown in Fig. 10, which deletes the most
  general rule, i.e. a rule that has input from every possible value. It
  must be ensured that there is another rule covering the same output class;
  otherwise it would be possible to get back to the bottom element in a cycle.

Going in the other direction, from the top to the bottom element, these three
rules can be used inversely as specialize (1), delete-special and insert-general.
The algorithm presented in Section 4, for example, uses only the specialize
and insert-general rules, named there shrink and commit. In other cases several
of these small steps must be executed sequentially without interruption to form
one single step of the algorithm. For example, in Section 3 a rule is inserted
for each example. This bigger step consists of one insert-special and several
generalize rule applications.

(1) This rule is often called the dropping rule in the literature [11].

Fig. 9. The input to a rule is generalized by inserting a new edge between a
node modeling a variable and a node modeling a rule.

Fig. 10. A most general rule having input from each possible value is deleted.
It must be ensured that there is one remaining rule for the output class.

Nevertheless infinitely many rule graphs can be created by successively inserting
new rules. Semantically this means that equal rules exist in the graph.
The number of really different rule graphs can be calculated as follows. Having $n$
input classes with $m_i$, $1 \le i \le n$, possible values, there are

$$R = \prod_{i=1}^{n} \sum_{j=1}^{m_i} \binom{m_i}{j}$$

different rules and $2^R$ different rule graphs for only one output class. For
the specific algorithm used, one has to make sure that it actually terminates.
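
To make the count concrete, the following Python sketch evaluates it for the
running example. It assumes the reading that a rule takes a nonempty subset of
the values of every input variable, so that a variable with m values contributes
2^m - 1 choices. The helper names count_rules and enumerate_rules are
hypothetical, and the brute-force enumeration only serves as a cross-check.

```python
from itertools import combinations, product
from math import comb


def count_rules(value_counts):
    """Product over all variables of the number of nonempty value subsets."""
    total = 1
    for m in value_counts:
        total *= sum(comb(m, j) for j in range(1, m + 1))  # equals 2**m - 1
    return total


def enumerate_rules(variables):
    """Cross-check: list every combination of nonempty value subsets."""
    per_variable = []
    for values in variables:
        subsets = [c for j in range(1, len(values) + 1)
                   for c in combinations(values, j)]
        per_variable.append(subsets)
    return list(product(*per_variable))


variables = [("cold", "warm", "hot"), ("cheap", "medium", "high")]
R = count_rules([len(v) for v in variables])
assert R == len(enumerate_rules(variables))
print(R)       # 49 different rules
print(2 ** R)  # number of different rule graphs for one output class
```

For the two variables of the running example, each with three values, this gives
49 different rules and hence 2^49 different rule graphs for one output class.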



6 What Can Be Done? 

With the hypothesis space defined as described the following issues can be ex- 
plored further: 

- Having a theoretically founded description method, it is straightforward to
  use it for further theory-based modeling and proving. For example, termination
  of training algorithms can be shown based on ideas developed for
  general rewriting: by mapping graphs onto elements of a terminating partial
  order and by showing that the application of rewrite rules to graphs only
  makes them smaller along the lines of this partial order, termination can be
  shown. E.g. the termination of the algorithm given in [3] can be proven along
  the same lines as shown for a similar Neural Network algorithm in [8].

- The version space [12] contains all rule graphs that describe the training
  examples correctly. Assume that two graphs are in relation if they handle
  the same set of training examples correctly. This equivalence relation can be
  used to describe confluence. If an algorithm can be shown to be confluent
  with respect to this equivalence relation, it always leads to a result modeling
  the training data correctly. This does not mean that the same result is always
  reached. The goal is just to show that one possible rule graph can be
  constructed, as ensured by the equivalence relation. Results obtained in graph
  rewriting help to prove (local) confluence. When taking the Double-Pushout
  Approach into account ([14], Chapter 1), theorems dealing with parallel
  independence are useful.

- Data can be noisy, so that it is useful not to take certain examples into
  account when building the rule model. In that case the hypothesis space and
  the version space do not differ much. This leads to hierarchical spaces. It is
  an interesting question how these spaces are related.

- For practical application, tool support must be provided. In a rapid
  prototyping system for rule generation, a few basic transformation rules must
  be provided together with e.g. application conditions and control flow
  specifications. With the help of a visualization for a smaller set of testing
  examples, it can be shown how fast a given algorithm converges to a possible
  rule graph.

Whereas the first two points represent actual developments, points three and four
are future work.
Abstract. Theorem proving for functional programming languages can
be made much easier by the availability of a dedicated theorem prover. 
A theorem prover is dedicated to a specific programming language when 
it fully supports the syntax and semantics of the language and offers 
specialized proving support for it. Using a dedicated theorem prover is 
easy, because one can reason about a developed program without having 
to translate it. However, no suitable dedicated theorem prover for a functional
language exists yet. This paper describes a simple prototype of a
dedicated theorem prover for the functional language Clean. A descrip- 
tion of the possibilities of the prototype is given and an examination is 
made of the work that needs to be done to extend the prototype to a fully 
operational and truly useful programming tool. Also example proofs of 
some basic properties and of a graph transformation are given. 



1 Introduction 

Functional programming languages like Clean [9] and Haskell [11] are well suited
for theorem proving. They are based on the well-defined notion of term graph
rewriting [10] and are free of side-effects. As can be seen in [2], it is very easy to
prove simple properties of functional programs. 

Unfortunately, when programs get larger, theorem proving gets increasingly
difficult. Proving properties of real-life applications can take several months
and is still only performed by teams of experts. But proving properties of small 
essential pieces of the program can be very useful as well, especially when it is 
done in an early phase. Errors in functions can be corrected before they have 
effect on other parts of the program. Once the correctness of a function has been 
established, it can be used (and re-used) safely in other parts of the application. 

Good support for theorem proving could benefit programmers. Many powerful
tools for theorem proving are available, for instance Coq [1] and Isabelle [8],
which are claimed to be well suited for functional programming languages. How- 
ever, proving properties of programs written in Clean using Coq or Isabelle 
turned out to be very difficult. They do not support the syntax or semantics 
of Clean, making it necessary to model the semantics of Clean and to translate 
the program to this model. The reasoning then takes place on the model of the 
program, instead of on the program itself. Also, the user interfaces of Coq and 
Isabelle are primitive and do not offer much support for the interactive reasoning
process. Commands have to be typed explicitly in some kind of syntax and it is 
difficult to find out what commands are needed to finish a proof. 

These problems can be partially overcome by building an interface on top 
of Coq or Isabelle. This interface can automatically translate programs to and 
from the theorem prover and provide a sophisticated user interface. Still, the 
semantics of Clean has to be modeled in Coq or Isabelle and this is far from 
trivial. 

But for proving simple properties only a small part of Coq or Isabelle is 
needed. This makes it feasible to implement a small theorem prover for Clean 
ourselves. This will eliminate the need for translations, since the reasoning will 
take place on the program itself. Also a dedicated theorem prover can be devel- 
oped to meet our specific goals, i.e. usable by programmers during the develop- 
ment of programs to prove simple properties fast. 

To test the effort needed to implement a small dedicated theorem prover for 
Clean a prototype has been developed. It turned out to be fairly easy to imple- 
ment a reasonably powerful theorem prover. In this paper a short description 
of the prototype will be given. First the restrictions of the prototype will be 
given. It is described how proofs can be built using the prototype. Some exam- 
ples of proven theorems and proofs are given next. Finally the extension of the 
prototype to a complete theorem prover is discussed. 

2 Restrictions of the First Prototype 

Developing a theorem prover which fully supports Clean is a lot of work. To allow 
for the rapid development of a prototype, the input language has been simplified 
a lot. First of all the graph rewriting which underlies Clean is reduced to term 
rewriting; no cycles are allowed. Secondly the lazy reduction mechanism is re- 
duced to an eager one; no infinite intermediate results are allowed and functions 
must always terminate. Thirdly partial functions are not allowed; all expressions 
must have a well defined value. Finally syntactic sugar like comprehensions,
dot-dot expressions and local definitions is not supported.

What is left is a very small subset of a functional language. This is not only 
a subset of Clean, but for instance of Haskell and ML as well. The results of 
the prototype can therefore be applied to other functional languages than Clean 
as well. Although the subset is very small, many interesting functions can be 
expressed in it. How these restrictions can be lifted is discussed under further
work (Section 3.6).

3 Using the Prototype to Construct Proofs 

In order to prove a theorem about a program in the prototype three things have 
to be done: (1) the program has to be expressed in the prototype (this boils 
down to removing the sugar from a simple Clean program), (2) the theorem 
has to be expressed in the prototype and (3) the proof for the theorem has to 
be constructed by supplying proving commands. In the next subsections these 
phases are described separately. 
3.1 Specification of the Program

Algebraic types are the only definable types in the prototype. An algebraic type 
is defined by a number of data-constructors which are able to construct elements 
of the type. The type of the booleans can for instance be defined as: 

:: Bool = True | False

It is also possible to define higher-order types and to define types using
recursion. In this way the type of the lists can be defined as:

:: List a = Nil | Cons a (List a)

The empty list Nil is usually denoted by [] and the construction of lists by
Cons x xs is usually denoted by [x:xs].

Functions are defined using pattern-matching. In the left-hand-side of a pat- 
tern only applications of data-constructors on variables are allowed. The right- 
hand-side of a pattern can be any expression. An expression can either be a 
variable or an application of a function or data-constructor. Higher-order, par- 
tial and recursive applications are allowed. An example of a valid definition is: 

Map :: (a -> b) (List a) -> (List b)
Map f [] = []
Map f [x:xs] = [f x: Map f xs]

3.2 Specification of the Theorem 

Theorems about programs are basically equalities between expressions stated in 
a first-order predicate logic. The logical operators that are allowed are ∨, ∧, ¬
and →. Quantifications over types and over expressions of any type are allowed.
Examples of stated properties are:

∀a,b ∀f::a→b . Map f [] = []

∀a ∀x::a ∀xs::List a . Length [x:xs] = Length xs + 1

3.3 Building a Proof 

Proofs are constructed much the same way as in most traditional theorem 
provers. First the statement to prove is specified as the current goal. This goal 
is then gradually transformed to simpler goals by the application of reasoning 
steps. This kind of reasoning is called backwards reasoning. 

The reasoning steps are called tactics. Each tactic must be sound with respect 
to the semantics of the program. A tactic may transform a goal to a logically 
equivalent or stronger one. The former tactics will be called ‘safe tactics’, the 
latter ‘risky tactics’. A risky tactic can easily lead to a proof state which cannot
be extended to a complete proof and must therefore be handled with care. For 
this purpose the risky tactics return a list of possible outcomes, while the safe 
tactics produce only one outcome. The following safe tactics are available in the 
prototype: 
1. Uncurry. Collects the arguments of successive applications, for example:

(Map +) [] → Map + []. All occurrences are rewritten at once.

2. SimplifyStep. Applies a rewrite-rule. Function patterns, lemmas, proven goals
and the semantics of the logical operators are all represented by rewrite-rules.
At most one rewrite is executed.

3. UnequalConstructors. Replaces at most one equality between two different
data-constructors by False.

4. Split. Splits a goal P ∧ Q into two goals P and Q.

5. HypoStep. Creates rewrite-rules (P → Q) → True for each suitable hypothesis
P → Q in the context and calls SimplifyStep with this set of rewrite-rules.

6. Induction. Applies standard induction on the outermost quantification. The 
appropriate induction scheme is dynamically constructed. Induction on all 
algebraic types is allowed. 

7. Introduction. Either removes the outermost quantification by adding the typing
information to the context, or transforms a goal P → Q to Q by adding P
as a hypothesis to the context.

The following risky tactics are available: 

8. Generalize. Substitutes a suitable subexpression by a free variable and then 
adds a quantification over it. This tactic can not be used on variables. For 
each suitable subexpression an outcome is generated. 

9. SimplifyEquality. Rewrites at most one equality between applications of the
same function by assuming that the function is injective.

10. GeneralizeVariable. Adds a quantification over a free variable in the goal.
For each free variable an outcome is generated. 

11. Unintroduce. Unintroduces a hypothesis in the context by creating an im- 
plication in the goal. The hypothesis is not removed from the context. For 
each hypothesis an outcome is created. 

12. UseEquality. Creates rewrite-rules P → Q and Q → P for each hypothesis P
= Q in the context and calls SimplifyStep with this set of rewrite-rules.

All of the tactics are also present in one way or the other in traditional 
theorem provers. This small set of tactics proved to be powerful enough for the 
prototype. A more detailed description of the tactics can be found in [6]. 
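
The SimplifyStep tactic can be pictured as a one-step term rewriter. The
following Python sketch is an illustration only and says nothing about how the
prototype is implemented: the term representation (variables as strings,
applications as tuples), the outermost-leftmost search order and all function
names are assumptions. The two rules are the pattern match rules for Reverse
that appear in the proof of Table 1.

```python
# Terms: a variable is a str, an application is a tuple (symbol, arg1, ...).
NIL = ("Nil",)


def cons(head, tail):
    return ("Cons", head, tail)


def match(pattern, term, subst):
    """Return subst extended so that pattern matches term, or None."""
    if isinstance(pattern, str):  # pattern variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(term, str) or len(pattern) != len(term) or pattern[0] != term[0]:
        return None
    for sub_pattern, sub_term in zip(pattern[1:], term[1:]):
        subst = match(sub_pattern, sub_term, subst)
        if subst is None:
            return None
    return subst


def substitute(term, subst):
    """Replace the variables of term by their bindings in subst."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(substitute(arg, subst) for arg in term[1:])


def simplify_step(term, rules):
    """Execute at most one rewrite; return None if no rule applies."""
    for lhs, rhs in rules:
        subst = match(lhs, term, {})
        if subst is not None:
            return substitute(rhs, subst)
    if not isinstance(term, str):
        for i in range(1, len(term)):
            rewritten = simplify_step(term[i], rules)
            if rewritten is not None:
                return term[:i] + (rewritten,) + term[i + 1:]
    return None


# Reverse [] = []   and   Reverse [x:xs] = (Reverse xs) ++ [x]
RULES = [
    (("Reverse", NIL), NIL),
    (("Reverse", cons("x", "xs")), ("++", ("Reverse", "xs"), cons("x", NIL))),
]

goal = ("Reverse", ("Reverse", NIL))
goal = simplify_step(goal, RULES)  # Reverse []
goal = simplify_step(goal, RULES)  # []
print(goal)  # ('Nil',)
```

Each call performs exactly one rewrite, which mirrors the way every Simplify
line of Table 1 changes the goal in a single place.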

3.4 Automatic Proof Construction 

The tactics in the previous subsection are the basic tactics of the prototype. An 
advantage of a dedicated theorem prover is that tactics can be composed in the 
way that is most convenient for the application domain. The prototype provides 
a composed tactic Auto for this purpose, with which automatic proof search 
specifically for simple theorems about functional programs can be modeled. 

Ideally the Auto tactic should try all possible combinations of basic tactics. 
In this way it can be ensured that as many proofs can be found automatically 
as interactively. Unfortunately this is not possible, since trying all possible com- 
binations of basic tactics simply takes too much time. Therefore the number of 
tried combinations is reduced using a search heuristic. For this heuristic first a
stripped version, called SafeAuto, which is used for recursive calls, is defined.
This tactic uses the following strategy: 

1. Apply the first safe tactic that can be applied on the current goal. Make the
outcome the new goal and recursively call SafeAuto. Proceed to step 2 when
no safe tactic can be applied.

Note that induction is always tried before introduction.

2. Apply Generalize to obtain a list of outcomes. Recursively call SafeAuto on
each outcome. If calling SafeAuto completely solves an outcome, use this
result and exit. Otherwise undo the application of Generalize.

This procedure will be abbreviated as ‘multitry Generalize with SafeAuto’.

3. Multitry SimplifyEquality with SafeAuto. 

A distinction is thus made between the safe tactics and the risky tactics. 
Applications of safe tactics can never be undone, while a form of backtracking 
is performed for risky tactics. The Auto tactic can now be described as follows: 

1. Apply SafeAuto. 

2. Multitry GeneralizeVariable2 with SafeAuto.

(GeneralizeVariable2 applies GeneralizeVariable twice, storing all possible
outcomes in a single list.)

3. Multitry Unintroduce2 with SafeAuto. 

4. Multitry UseEquality2 with SafeAuto. 

Prohibiting recursive calls of GeneralizeVariable, Unintroduce and UseEquality
and eliminating backtracking after safe tactics makes the Auto tactic fast
enough. Fortunately, our experience shows that little proving power is lost.
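
The control flow of SafeAuto and Auto can be sketched in Python as follows. The
sketch only models the search strategy: goals are plain integers, 0 plays the
role of a proven goal, and the three toy tactics drop_two, guess and flip are
invented for the example. None of these names or representations come from the
prototype.

```python
from typing import Callable, List, Optional

Goal = int  # toy stand-in for a proof goal; 0 plays the role of "True"
SafeTactic = Callable[[Goal], Optional[Goal]]  # None: not applicable
RiskyTactic = Callable[[Goal], List[Goal]]     # list of possible outcomes


def simplify(goal: Goal, safe: List[SafeTactic]) -> Goal:
    """Repeatedly apply the first applicable safe tactic (never undone)."""
    while True:
        for tactic in safe:
            new_goal = tactic(goal)
            if new_goal is not None:
                goal = new_goal
                break
        else:
            return goal  # no safe tactic applies any more


def multitry(risky: RiskyTactic, goal: Goal,
             solver: Callable[[Goal], bool]) -> bool:
    """Use the first outcome the solver closes; otherwise undo the tactic."""
    return any(solver(outcome) for outcome in risky(goal))


def make_provers(safe, inner_risky, outer_risky):
    def safe_auto(goal: Goal) -> bool:
        goal = simplify(goal, safe)
        if goal == 0:
            return True
        return any(multitry(r, goal, safe_auto) for r in inner_risky)

    def auto(goal: Goal) -> bool:
        if safe_auto(goal):
            return True
        goal = simplify(goal, safe)
        # The outer risky tactics call SafeAuto, not Auto: no nested use.
        return any(multitry(r, goal, safe_auto) for r in outer_risky)

    return safe_auto, auto


def drop_two(goal):  # safe toy tactic
    return goal - 2 if goal > 0 and goal % 2 == 0 else None


def guess(goal):     # risky toy tactic used inside SafeAuto
    return [goal - 3, goal - 1] if goal > 0 and goal % 2 == 1 else []


def flip(goal):      # risky toy tactic used only by Auto
    return [-goal] if goal < 0 else []


safe_auto, auto = make_provers([drop_two], [guess], [flip])
print(safe_auto(6), safe_auto(1), safe_auto(-4), auto(-4))
# True True False True
```

The call safe_auto(1) shows the limited backtracking: the first outcome of the
risky tactic cannot be closed and is discarded, the second one succeeds.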

3.5 Examples of Proven Theorems and Proofs 

The prototype has been tested using examples from the book ‘Introduction 
to Functional Programming using Haskell’ [2]. All 72 tried theorems could be 
proven. A total of 27 lemmas were introduced to facilitate the proving process. 
These lemmas were inspired by stuck proving sessions and were easily found. 
All lemmas could be proven automatically and, using the lemmas, 70 of the 72 
theorems could be proven automatically as well. A full list of proven theorems 
can be found in [5]; some examples follow:

1. ∀a ∀xs::List a . Reverse (Reverse xs) = xs

2. ∀a ∀xs::List a ∀n::Nat . (Take n xs) ++ (Drop n xs) = xs

3. ∀x::Nat ∀y::Nat ∀z::Nat . x ^ (y + z) = (x ^ y) * (x ^ z)

4. ∀x::Nat . Log (2 ^ x) = x

An example proof of the first theorem is shown in Table 1. An automatic 
proof attempt fails on proof state 6. Examining this state, a lemma was introduced:
∀a ∀x::a ∀xs::List a . Reverse (xs ++ [x]) = [x: Reverse xs]. This lemma
was proven automatically, and using the lemma the proof can be completed
automatically.

 1. Introduce a
      ∀xs::List a . Reverse (Reverse xs) = xs

 2. Induction xs
      Reverse (Reverse []) = []                                        (IB)

 3. Simplify With "Pattern Match Rule [Reverse []]"
      Reverse [] = []

 4. Simplify With "Pattern Match Rule [Reverse []]"
      [] = []

 5. Simplify With "Rule [x = x]"
      Reverse (Reverse xs) = xs ⇒
      Reverse (Reverse [x:xs]) = [x:xs]                                (IS)

 6. Simplify With "Pattern Match Rule [Reverse [x:y]]"
      Reverse (Reverse xs) = xs ⇒
      Reverse ((Reverse xs) ++ [x]) = [x:xs]

 7. Simplify With "Lemma [Reverse (xs ++ [x]) = [x:Reverse xs]]"
      Reverse (Reverse xs) = xs ⇒ [x: Reverse (Reverse xs)] = [x:xs]

 8. Simplify With "Rule [[x:xs] = [y:ys]]"
      Reverse (Reverse xs) = xs ⇒ x = x ∧ Reverse (Reverse xs) = xs

 9. Simplify With "Rule [x = x]"
      Reverse (Reverse xs) = xs ⇒ True ∧ Reverse (Reverse xs) = xs

10. Simplify With "Rule [True ∧ P]"
      Reverse (Reverse xs) = xs ⇒ Reverse (Reverse xs) = xs

11. Simplify With "Rule [P ⇒ P]"
      True

Table 1. An example proof of ∀a ∀xs::List a . Reverse (Reverse xs) = xs
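
The two pattern-match rules for Reverse and the lemma used in step 7 of Table 1 can be exercised on concrete lists. The following is a minimal sketch in Python rather than Clean; the function name reverse and the sample inputs are assumptions made for illustration. It only tests the statements on a handful of lists and is therefore no substitute for the inductive proof.

```python
def reverse(xs):
    """Reverse, defined by the two pattern-match rules used in Table 1."""
    if not xs:                       # Reverse [] = []
        return []
    head, tail = xs[0], xs[1:]       # Reverse [x:xs] = Reverse xs ++ [x]
    return reverse(tail) + [head]


# A few sample lists (hypothetical test data).
samples = [[], [1], [1, 2], [3, 1, 2], list(range(10))]

# Theorem 1: Reverse (Reverse xs) = xs
theorem_holds = all(reverse(reverse(xs)) == xs for xs in samples)

# Lemma: Reverse (xs ++ [x]) = [x : Reverse xs]
lemma_holds = all(
    reverse(xs + [x]) == [x] + reverse(xs)
    for xs in samples
    for x in (0, 7)
)
```

Both flags come out true for these samples, which is consistent with the lemma being exactly what is needed to get past proof state 6.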



3.6 Upgrading the Prototype: Further Work 

The prototype is a very small theorem prover. A lot of work needs to be done 
to obtain a fully operational dedicated theorem prover for Clean: 

1. Support for full syntax. This can easily be accomplished by re-using the
existing parser for Clean. By invoking the compiler one can even get a simpler
(internal) representation of a program written in Clean as well.

2. Support for full semantics. The semantics has to be extended with laziness, 
partial functions and graphs. A large part can be accomplished by imple- 
menting lazy graph-rewriting in the theorem prover. 

3. Tactics for infinite structures. Infinite structures require different proving
techniques, such as co-induction (see the example in the next subsection).
These techniques are, however, much more difficult to use than ordinary
techniques like structural induction. Therefore it may be necessary in some cases
to prove correctness provided no infinite structures occur.

4. Support for the standard library. Many functions in the standard library in 
Clean are inlined: they are implemented in machine code. The semantics of 
these functions has to be modeled. 
Once this has been achieved, a step further can be taken. The dedicated
theorem prover can be integrated in Clean by providing links with the existing 
development tools for Clean. For example, a link between the editor and the 
theorem prover can be made. Integration in a programming language can greatly 
enhance the user-friendliness of a theorem prover. 

An integrated theorem prover can easily be used to show the correctness of 
safety critical applications. Also, programs can be annotated with proven logical 
statements which describe the behavior of components of the application. 



3.7 Example Proof with Graphs, Cycle Unfolding and Co-induction 

Suppose one wants to prove Iterate id 1 = Ones (1) using

Ones = [1:Ones]
id x = x
Iterate f x = xs where xs = [x: Map f xs]

Start by expanding the definitions of Iterate and Map once (2): 






Now remove the Cons 1 start-nodes on both graphs and unfold the definition 
of Map on the left-hand-side (3). Then use the fact that Map id xs = xs and 
expand the definition of Ones on the right-hand-side again (4): 






This equality is the same as (2). Because in going from (2) to (3) a Cons 1 
was removed (and thus the step was ’productive’), we can now use (2) to prove 
(4) by a co-inductive argument. This completes the proof. Note that besides 
co-induction also cycle-unfolding is used in this proof. 
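
The equality Iterate id 1 = Ones can be illustrated, though not proven, by comparing finite prefixes of the two infinite lists. The sketch below uses Python generators as a stand-in for lazy lists; the names ones, iterate and prefix are assumptions, and the generator version does not share the cyclic node xs the way the graph in the proof does.

```python
from itertools import islice


def ones():
    """Ones = [1:Ones]"""
    while True:
        yield 1


def iterate(f, x):
    """Iterate f x = xs where xs = [x : Map f xs].

    The recursive call plays the role of the cycle; without sharing,
    the n-th element costs n nested generators, so keep prefixes short.
    """
    yield x
    for y in iterate(f, x):
        yield f(y)


def prefix(stream, n):
    """First n elements of a (possibly infinite) generator."""
    return list(islice(stream, n))


identity = lambda v: v
agree = prefix(iterate(identity, 1), 50) == prefix(ones(), 50)
```

Agreement on every finite prefix is what the co-inductive argument establishes in general: each productive step exposes one more Cons 1 on both sides.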

4 Conclusions and Related Work 

With the prototype it is possible to prove many interesting theorems about 
Clean-programs in an easy way. These theorems can contribute to making pro- 
grams more reliable. Although there is still a long way to go, the early re- 
sults are very encouraging. The development of a dedicated theorem prover for 
Clean will continue and we hope to report on some results in the near future at
http://www.cs.kun.nl/~maartenm/CleanProverSystem/.

Related work is described in [3], in which a description is given of a proof 
tool which is dedicated to Haskell. It supports a subset of Haskell and needs no 
guidance of users in the proving process. The user can however not manipulate 
a proof state himself by the use of tactics, and induction is only applied when 
the corresponding quantifier has been explicitly marked in advance. 

Further related work concerns a theorem prover for Haskell, called the Haskell 
Equational Reasoning Assistant [12], which is still under development. This proof 
tool is also dedicated to Haskell and supports Haskell 1.4. Proofs can only be con- 
structed using equational reasoning and case analysis. No other proof methods, 
like induction or generalization, are supported. ERA is a stand-alone application. 
Abstract. This paper proposes a bottom-up approach for identifying and
recognizing tables within a document. This approach is based on the paradigm 
of graph rewriting. First, the document image is transformed into a layout 
graph whose nodes and edges respectively represent document entities and 
their interrelations. This graph is subsequently rewritten using a set of rules 
designed for and based on a priori document knowledge and general formatting
conventions. The resulting graph provides both logical and layout views of the 
document content. 



1 Introduction 

Document Recognition is a process in which the information regarding the 
organization of the document content, i.e. the document structure, is extracted from 
the document image. A document structure must identify the entities on the page, 
e.g. paragraphs and tables, capture their properties, e.g. the number of columns in a 
table, and establish their interrelations, e.g. caption is below the table. This paper is 
primarily concerned with structure analysis beyond character recognition. The 
methodology presented here is equally applicable to a document described by a
Page Description Language (PDL) such as PostScript or Xerox Interpress.

Graphs are powerful tools for document structure analysis. They provide a 
compact computational abstraction for representing the complex multidimensional 
information embedded in a document structure. Manipulating graph nodes and links 
results in a new graph topology and consequently a new interpretation of the 
document structure. As a result, the tools for graph manipulation can be the basis for 
a computational document structure analysis framework. However, the use of graphs 
for document recognition has classically been limited to experimental systems due to 
the computational complexity of graph manipulation. A notable exception is the 
work of Fahmy and Blostein [1]. This work provides a comprehensive computational
framework for document recognition that has been missing in much of the earlier 
work in that area. Fu [2] has used graph grammars and syntactic pattern recognition 
for recognizing Chinese characters. Bunke [3] has employed similar techniques for 
line-drawing understanding. 
In this paper, we present a computationally feasible technique for manipulating
the graph of document images. Although the goal of the proposed system is to 
recognize table structures in a document, it can easily be extended to encompass a 
variety of document recognition tasks. The remainder of this section is devoted to 
presenting a formal terminology for describing graphs and a general technique for 
manipulating graphs based on a set of rules, called graph rewriting. 



1.1 Graphs 

Let $\Sigma$ and $\Delta$ be finite alphabets of node labels and edge labels respectively.
A graph $g = (V, E, f_V, f_E, A)$ over the set $\Sigma \cup \Delta$ is a 5-tuple where $V$ represents a
nonempty finite set of nodes, $E \subseteq V \times V$ represents a finite set of unordered pairs
of nodes called edges, $f_V : V \to \Sigma$ is an injective mapping that assigns a label to
each node, $f_E : E \to \Delta$ is an injective mapping that labels each edge, and
$A$ denotes the combined set of node and edge attributes. If $A \neq \emptyset$, $g$ is called an
attributed graph. A subgraph of $g$ is a graph consisting of some nodes of $g$ and
all the edges of $g$ which connect those nodes. In document processing applications,
$\Sigma$ contains entities such as words, paragraphs and tables, $\Delta$ contains geometric
and logical constraints such as left_of() and refers_to(), and $A$ contains attributes
such as bounding box and font type.
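
One way to hold such an attributed graph in memory is sketched below. This is an illustrative data structure, not the representation used by the system described here: the class name Graph, its field names, and the choice to store edges as ordered pairs (so that a relation such as left_of has a direction) are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    node_labels: dict = field(default_factory=dict)  # f_V: node id -> label in Sigma
    edge_labels: dict = field(default_factory=dict)  # f_E: (u, v) -> label in Delta
    attributes: dict = field(default_factory=dict)   # A: node id or edge -> attributes

    def add_node(self, node, label, **attrs):
        self.node_labels[node] = label
        self.attributes[node] = attrs

    def add_edge(self, u, v, label, **attrs):
        if u not in self.node_labels or v not in self.node_labels:
            raise KeyError("both endpoints must be nodes of the graph")
        self.edge_labels[(u, v)] = label
        self.attributes[(u, v)] = attrs

    def subgraph(self, nodes):
        """Some nodes of the graph plus all edges that connect those nodes."""
        nodes = set(nodes)
        sub = Graph()
        for n in nodes:
            sub.add_node(n, self.node_labels[n], **self.attributes[n])
        for (u, v), label in self.edge_labels.items():
            if u in nodes and v in nodes:
                sub.add_edge(u, v, label, **self.attributes[(u, v)])
        return sub


# Hypothetical three-entity page: a word, a line and a table.
g = Graph()
g.add_node("n1", "W", bbox=(0, 0, 40, 10), font="Times")
g.add_node("n2", "L", bbox=(45, 0, 90, 10), font="Times")
g.add_node("n3", "TAB", bbox=(0, 20, 90, 80))
g.add_edge("n1", "n2", "left_of")
g.add_edge("n1", "n3", "top_of")
sub = g.subgraph(["n1", "n2"])
```

The subgraph method follows the definition in the text: it keeps the chosen nodes and every edge of the host graph that connects two of them.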



1.2 Graph Grammars & Rewriting 

Graph rewriting is a sequential process whereby a subgraph $g_l$ of a host graph $g$
is replaced with another graph $g_r$ at each step. Each replacement step involves three
distinct tasks: first, locating a subgraph isomorphic to $g_l$ in $g$ and detaching it
from the host graph; next, replacing $g_l$ by a graph isomorphic to $g_r$ and attaching it
to the host graph; and finally, recomputing the attributes of the new host graph. A
formal method for carrying out these tasks is based on graph grammars. A
grammatical approach provides the necessary framework for expressing the graph
manipulation rules as well as a strategy for controlling the execution of those rules.
Graph rewriting is, however, less stringent than the classical grammatical techniques
since it does not require the ultimate reduction of the productions into a start symbol.
In what follows, we use the terminology introduced by Rozenberg et al. [4] to
formally describe graph rewriting as it pertains to our application.

An Attributed Graph Grammar, or AGG, is a five-tuple $\Gamma = (\Sigma, \Delta, S, P, A)$
where $\Sigma$ and $\Delta$ are node and edge alphabets respectively and $A$ is the combined
set of the node and edge attributes. $\Sigma$ is the union of two disjoint alphabets $\Sigma_T$ and
$\Sigma_N$, called terminal and nonterminal alphabets respectively. Terminal entities
cannot be decomposed into constituent parts. $S \in \Sigma_N$ is the grammar start
symbol. $P$ denotes a finite set of grammar productions of the form
$p = (g_l, g_r, \alpha, \beta, \eta)$, where $g_l$ is the left-hand side of the production $p$, which
is a graph, $g_r$ is the right-hand side of the production, which is also a graph, and $\alpha$, $\beta$
and $\eta$ are the embedding relation, the attribute transfer function and the application
condition for the production respectively. Each relation $\alpha$ for an undirected graph
$G$ is a set of the form
$\{(\gamma_1, \delta_1, \gamma_3, \gamma_2, \delta_2, \gamma_4) \mid \gamma_1, \gamma_2, \gamma_3, \gamma_4 \in \Sigma,\ \delta_1, \delta_2 \in \Delta\}$.
Also, each attribute transfer function is of the form $\beta : A \to A$ and each
application condition is a constraint of the form $\eta : A \to \{T, F\}$. Starting from $S$,
the set of all the graphs that can be generated by repeated applications of the
productions in $P$ is called the language generated by grammar $\Gamma$. If the left-hand
side of each production is a single nonterminal, the language is said to be
context-free.

Let $G_{rem}$ be the remainder graph of $G$ constructed by removing graph $g_l$ and
all the edges connecting to it from $G$. Assuming the condition $\eta$ is satisfied for the
production $p$, the rewriting step can be defined as follows: First, $g_l$ (or a graph
isomorphic to $g_l$) is replaced with $g_r$ (or a graph isomorphic to $g_r$). Next, $g_r$ is
glued into $G_{rem}$ using the embedding relation $\alpha$ in the following manner: An edge
with label $\delta_2$ that connects a node $v_2 \in g_l$ with label $\gamma_2$ to a node $v_4 \in G_{rem}$
with label $\gamma_4$ is replaced by an edge with label $\delta_1$ that connects a node $v_1 \in g_r$
with label $\gamma_1$ to a node $v_3 \in G_{rem}$ with label $\gamma_3$. The new graph $G'$ constructed
as such is said to have been derived from $G$ using the production $p$,
$G \Rightarrow_p G'$. Finally, the node attributes for all the nodes in $G'$ are calculated
using the function $\beta$. The derivation process can be continued until all the nodes
remaining in the graph belong to the terminal alphabet $\Sigma_T$.

Given a graph $G$ and a grammar $\Gamma$, parsing is the task of determining whether
$G$ belongs to the language generated by $\Gamma$. There are two general parsing
strategies. In bottom-up parsing, $G$ is reduced to the start symbol $S$ by reverse
application of the productions in $P$, whereas in top-down parsing the symbol $S$ is
expanded by forward application of the productions in $P$ until the graph $G$ is
generated. Bottom-up parsing requires the replacement of each subgraph $g_r$ in $G$
by a graph $g_l$. Locating a subgraph $g_r$ (or a subgraph isomorphic to it) in $G$ is an
NP-complete problem commonly known as subgraph isomorphism, i.e. its solution is
computationally expensive. Top-down parsing is equally expensive computationally,
since backtracking will generally be needed when there are productions in $P$ that have
the same left-hand side. Parsing can become more complicated if it is required to
find a best solution amongst a set of possible parses. Indeed, this is the case if the
underlying grammar $\Gamma$ is ambiguous, i.e., more than one sequence of
productions can generate the same graph. This is true in document recognition
applications, where there are many possible ways to generate an entity, e.g. a table,
from a set of related data.

The document recognition technique described in this paper uses a bottom-up
parsing approach. The production rules described later in this paper are used in 
reverse order by matching against the entities in a scanned document to determine 
the logical document constructs such as tables, paragraphs, and lists. 

To parse efficiently, restrictions are typically placed on the underlying grammar. 
A common type of restriction is the use of application conditions. The languages 
generated as such are known as controlled languages. Application conditions are 
used to limit the complexity of the search in the host graph by restricting the number 
of nodes or edges that are considered. For example, color or font attributes can 
restrict the search area for locating headings and captions. Prioritizing the 
production set based on the informal knowledge of the application domain is another 
method of reducing the complexity of parsing. The grammars that explicitly 
maintain an ordering of their productions are known as ordered grammars. By 
assigning higher priorities to the productions that describe more typical behaviors in 
an application, the parsing algorithm attempts to recognize those behaviors when it 
encounters multiple possibilities in extending the parse. For example, in a document 
recognition application, it might be desirable to extend the parse horizontally 
(vertically) first when reading a paragraph (a table). Priorities can also be assigned to 
the most specific entities in a hierarchical structure. This is most useful when the 
language is ambiguous. For example, we may first decide whether a structure is a list
and then decide whether it is a bulleted list or an index list.
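
The combination of application conditions and an explicit ordering of productions can be sketched as follows. This is a minimal illustration under stated assumptions, not the parser described in this paper: the Production record, the convention that a larger priority value is tried first, the attribute names and the three sample productions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Production:
    name: str
    priority: int                       # larger value = tried first (assumed convention)
    condition: Callable[[dict], bool]   # application condition on node attributes


def first_applicable(productions, node_attrs):
    """Return the highest-priority production whose condition holds, else None."""
    for p in sorted(productions, key=lambda p: p.priority, reverse=True):
        if p.condition(node_attrs):
            return p
    return None


# Hypothetical ordered grammar: headings first, then horizontal, then vertical.
productions = [
    Production("merge_vertical", 1, lambda a: a["has_neighbor_below"]),
    Production("merge_horizontal", 2, lambda a: a["has_neighbor_right"]),
    Production("mark_heading", 3, lambda a: a["font_size"] >= 14),
]

body_text = {"font_size": 10, "has_neighbor_right": True, "has_neighbor_below": True}
heading = {"font_size": 18, "has_neighbor_right": False, "has_neighbor_below": True}
isolated = {"font_size": 10, "has_neighbor_right": False, "has_neighbor_below": False}

chosen = [first_applicable(productions, a) for a in (body_text, heading, isolated)]
```

For the body text both merges are applicable, and the ordering resolves the ambiguity in favour of the horizontal one; the cheap attribute test on font size prunes the search before any subgraph matching would be attempted.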



2 Document Structure Recognition 

In this section, we propose a new system for document structure recognition based on 
a graph rewriting paradigm. It consists of four subsystems, namely segmentation, 
graph construction, entity recognition and graph rewriting. Fig. 1 shows the overall 
system diagram. The order of operations is as follows: First, the segmentation 
subsystem identifies textual and image document parts. Then, the graph construction 
module builds the layout graph of the document using the segmented parts. Next, the 
entity recognition subsystem tags each of the text parts with a label from the 
document node alphabet and finally the graph rewriting system identifies the 
document structure by manipulating the labeled graph. Both scanned raster and PDF 
document images can be used as input to this system. 

The grammatical basis proposed in the design of this system lends itself to a 
modular architecture that enables the interplay between the entity recognition and 
graph rewriting subsystems. By providing an abstraction and a set of well-defined 
interfaces that enable this interplay, it is possible to provide a control strategy that
allows an external recognition module to carry out the same task as a grammar
production. This allows for the use of various advanced recognizers within a
well-controlled modular architecture. In the remainder of this section, we discuss the
above subsystems in more detail. 





Fig. 1. A document recognition system diagram 






2.1 Segmentation 

The segmentation subsystem divides the document image into contiguous areas of 
text, line-drawings, images and halftones. Even though there are no restrictions 
placed on the segmentation modules, we require that the output of this stage be a list 
of non-overlapping entities. These entities make up the nodes of the input graph for 
the rewriting system and each will be labeled by a different character of the rewriting 
alphabet as described in Section 2.3. Each entity is required to have a bounding box. 
Other attributes such as font type in case of text, curve type in case of a line-drawing, 
or color map in case of an image can be associated with each entity by the 
segmentation module. The text output of the segmentation subsystem is allowed to be 
at different levels of granularity for different entities. Thus, it is most often the
case that the segmentation output contains a mix of individual characters, word 
fragments, lines and text blocks. In addition, since we will be primarily dealing with 
raster text input, we may allow for an OCR module to provide the actual content of 
the text entities. As we will see, this information is not necessary. The required 
segmentation output format and the suggested attributes can also be extracted from a 
PDL input. This allows for a uniform treatment of all document images beyond some 
elementary preprocessing stages. 



2.2 Graph construction 

The input graph for the rewriting subsystem is constructed prior to entity labeling. 
Graph nodes are constructed using the entities from the segmentation stage. Graph 
edges are constructed based on establishing a set of geometric relations between 
entity bounding boxes. For the initial graph, an edge is constructed between a pair of 
nodes if any of the four relations left_of(), right_of(), top_of(), or bottom_of() holds
between them. Fig. 2 depicts a typical document layout graph where these relations
are represented by the symbols L, R, T, and B. Each node in this graph keeps track of its
neighbors on each side in an ordered list. Although provisions are made to deal with
imperfections of segmentation, the case of overlapping and shadowing entities is outside
the scope of the current implementation. The computational complexity of the
graph construction is on the order of $n^2$, where $n$ is the number of entities.
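
The quadratic cost follows directly from testing every ordered pair of entities. The sketch below is one possible reading of the construction, not the implementation used here: the box format (x0, y0, x1, y1) with y growing downward, the exact geometric tests, and the convention that the label on edge (u, v) names the side of u on which v lies are all assumptions.

```python
from itertools import permutations


def spans_overlap(lo1, hi1, lo2, hi2):
    """True if the open intervals (lo1, hi1) and (lo2, hi2) intersect."""
    return lo1 < hi2 and lo2 < hi1


def relation(a, b):
    """Label of the edge from box a to box b, or None if no relation holds."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    if spans_overlap(ay0, ay1, by0, by1):   # side by side
        if bx1 <= ax0:
            return "L"                      # b lies to the left of a
        if ax1 <= bx0:
            return "R"                      # b lies to the right of a
    if spans_overlap(ax0, ax1, bx0, bx1):   # stacked
        if by1 <= ay0:
            return "T"                      # b lies above a
        if ay1 <= by0:
            return "B"                      # b lies below a
    return None


def build_layout_graph(boxes):
    """Test all n*(n-1) ordered pairs of entities, hence O(n^2) work."""
    edges = {}
    for u, v in permutations(boxes, 2):
        label = relation(boxes[u], boxes[v])
        if label is not None:
            edges[(u, v)] = label
    return edges


# Hypothetical page: a line N4 above two columns v1 and v2.
boxes = {
    "N4": (0, 0, 100, 10),
    "v1": (0, 20, 45, 100),
    "v2": (55, 20, 100, 100),
}
edges = build_layout_graph(boxes)
```

Each adjacency shows up as a pair of oppositely labeled edges, for example R from v1 to v2 and L from v2 to v1.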




Fig. 2. A layout graph of a typical document; the subgraph representing two columns, labeled
v1 and v2, in a host graph is highlighted



2.3 Entity recognition 

Prior to graph rewriting, document entities are labeled. The output of the 
segmentation subsystem identifies two fundamental layout entities, i.e., text regions 
and image regions. The entity recognition subsystem further refines the text regions'
labels according to their size. These entities comprise the initial document alphabet;
they are: {C, W, L, TR, IR}, where C, W, L, TR and IR represent a character, a 
word, a line, a text region and an image region respectively. These entities obey a 
hierarchical structure as shown in Fig. 3. 
Fig. 3. A typical entity hierarchy in text regions [C = Character, W = Word, L = Line, and
TR = Text Region]

The hierarchical structure of Fig. 3 is well suited for a grammatical 
reconstruction. Indeed, there are rules in our current system that enable the 
construction of an entity from its constituents. However, these rules are only 
triggered if more efficient external modules for recognizing the same entities are not 
available. This concept, namely making provisions for an external module to carry 
out the same task as intended by a grammar production, is present throughout our 
recognition system. This allows for a powerful abstraction and a resulting modular 
architecture. For example, when there is a highly efficient liner module, an 
algorithm that can line up characters into lines directly and bypass the word 
recognition stage, it replaces the grammar rules that build lines. The system 
presented here takes advantage of many existing external modules in its architecture. 

In addition to the above refinement, the entity recognition subsystem classifies the 
text regions according to their logical identity. This classification produces the 
additional entity labels {P, COL, TAB, IDXL, JT, UT} where P, COL, TAB, 
IDXL, JT, and UT represent paragraph, column structure, tabular structure, 
indexed list, jagged text and unformatted text region respectively. These six labels
further refine and replace the layout label TR in the original alphabet, thus resulting
in a pre-rewriting document alphabet that contains {C, W, L, P, COL, TAB, IDXL,
JT, UT, IR}. The above classification is done using external modules designed based
on linear and second order statistics as well as neural networks.



2.4 Graph rewriting 

The goal of the graph rewriting subsystem is to extract the logical structure of a 
document from its layout graph. In this section, we first discuss the rewriting system 
in terms of its four constituents: 1) production rules, 2) embedding functions, 3) 
attribute transfer functions, and 4) application conditions. Next, we present a scheme 
for controlling the rewriting system that is dependent on the order of the application 
of the rules. Finally, we discuss the merits and the shortcomings of the system. 
2.4.1 Production rules



Each rewriting production rule, of the form $g_l \to g_r$, specifies an ordered pair of
graphs where the ordering implies that an isomorphic instance of the subgraph $g_l$ in
a host graph can be replaced with an isomorphic instance of the graph $g_r$. For
example, let's consider the following rewrite rule:




Fig. 4. A rewrite rule for constructing a table from two columns



This rule states that an occurrence of a graph of the form represented on the right-hand
side of the rule, which contains two nodes with column labels and two directed edges
between the two nodes with labels left and right, can be replaced in a host graph with 
an occurrence of a nonterminal labeled table that has two columns. This rule is said 
to be context-free since the left-hand side of the production only contains a 
nonterminal. An application of this rule to the layout graph of Fig. 2 is shown in 
Fig. 5. 




Fig. 5. A layout graph of a typical document where the rule in Fig. 4 is used to rewrite the
embedded subgraph representing the two columns in Fig. 2

The node labeled table replaces the graph representing two adjacent columns in
Fig. 2 as follows: A search is made to locate a single node $v_1$ that has a column label
in the host graph of Fig. 2; then the left, L, and the right, R, edges of $v_1$ are
identified. If another node $v_2$ with a column label adjacent to either edge (the
left-hand side of the rule is symmetric) exists, the entire subgraph containing $v_1$, $v_2$, and the
adjoining L and R edges is cut off from the host graph and is replaced with a node
labeled table. To complete the rewrite, the replacement node needs to be connected
to the remaining host graph using the embedding relations and its attributes need to
be calculated. These remaining steps are discussed in the coming sections.




Fig. 6. A rewrite rule for constructing a table from columns and assigning a HEADER label to
a line entity



A rule can also be used to label the entities in the host graph and provide 
additional edges. These rules are typically denoted as context-sensitive. For example, 
an application of the rule shown in Fig. 6 to the host graph of Fig. 2 results in 
assigning the logical label of HEADER to node N4. This is shown in Fig. 7. Two 
additional edges, HEAD and SUBJ, are added to the host graph that provide 
inferences between the two TABLE and HEADER entities. 
Fig. 7. A layout graph of a typical document where the rule in Fig. 6 is used to rewrite the
embedded subgraph representing two columns and a line in Fig. 2



2.4.2 Embedding relations 

The embedding relations associated with each rewrite rule specify how
the new subgraph is connected to the remainder graph $G_{rem}$ of the host graph $G$,
i.e., to what is left of $G$ after the matched subgraph is removed. Using the notation
developed earlier in Section 1.2, the
set of embedding relations associated with each rule can be partitioned in two, IN
and OUT, each containing six-tuples of the form $(\gamma_1, \delta_1, \gamma_3, \gamma_2, \delta_2, \gamma_4)$. Let's
consider the embedding for the rewrite rule in Fig. 4, which is shown in Fig. 8.

The set operators $\vee$, $|$, $*$, and $\neg$, with implied logical definitions of EITHER,
EACH, ANY, and NOT respectively, have been used in these expressions to make
the representation compact. Each embedding relation is interpreted by establishing a
correspondence between the node labels in the host graph $G$ and the vertices
indicated in the rewrite rule.

For instance, the first OUT embedding relation, i.e. $(2, L, *, 1, L, *)$, in view
of the application of the above rule to the host graph of Fig. 2, is interpreted as
follows: First, a subgraph isomorphic to the right-hand side of the rule is located in
the host graph. This is indicated by the shaded area in Fig. 2. Next, a correspondence
is made where the vertex labels 2 and 3 in the above expression represent host graph
nodes $v_1$ and $v_2$ respectively. Then, $(2, L, *, 1, L, *)$ indicates that any outgoing
edge with label L from the node $v_1$, i.e. 2, of $g_r$ in the host graph to any node,
i.e. $*$, of the remainder graph should be replaced by a new edge with the same
label L from node 1 of $g_l$ (i.e. the new node in the host graph with label TABLE)
to that same node. An example of this embedding relation is the replacement of the
outgoing edge with label L from $v_1$ to N1 in Fig. 2 with the outgoing edge with label
L from TABLE to N1 in Fig. 5. Other relations in the above expressions can be
interpreted in a similar manner. For example, $(3, L, \neg 2, 1, L, \neg 2)$ replaces any
outgoing edge with label L from $v_2$ in Fig. 2 with an outgoing edge with label L as
long as that edge is not directed to node $v_1$ in $g_r$. Also, $(2 \vee 3, T, *, 1, T, *)$
replaces any outgoing edge with label T from either $v_1$ or $v_2$ to any node in $G_{rem}$
with an outgoing edge with label T from the new node.



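[Rewrite rule diagram: node 1 forms $g_l$; nodes 2 and 3 form $g_r$]

IN = $\{(2, L, \neg 3, 1, L, \neg 3),\ (3, L, *, 1, L, *),\ (2, R, *, 1, R, *),\ (3, R, \neg 2, 1, R, \neg 2),\ (2 \vee 3, T, *, 1, T, *),\ (2 \vee 3, B, *, 1, B, *)\}$

Fig. 8. A rewrite rule and the associated embedding relations for constructing a table from
two adjoining columns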
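
The effect of these embedding relations on a small host graph can be sketched as follows. This is an illustrative simplification rather than the system's rewriting engine: the function name merge_columns, the dictionary-based graph encoding, and the node label strings "COL" and "TABLE" are assumptions. Every edge between a matched column and the remainder graph is re-attached to the new node with its label unchanged, while the edges between the two columns are dropped, which is what the geometric tuples of Fig. 8 amount to.

```python
def merge_columns(labels, edges, new_node="TABLE1"):
    """Replace two adjacent column nodes by one table node (in place).

    labels: node id -> node label; edges: (u, v) -> edge label.
    Returns True if a rewrite was performed, False if no match was found.
    """
    for (u, v), lab in edges.items():
        if (lab == "R" and labels[u] == "COL" and labels[v] == "COL"
                and edges.get((v, u)) == "L"):
            break                      # found two adjoining columns
    else:
        return False

    matched = {u, v}
    rewired = {}
    for (a, b), lab in edges.items():
        if a in matched and b in matched:
            continue                   # edges inside the matched subgraph vanish
        a2 = new_node if a in matched else a
        b2 = new_node if b in matched else b
        rewired[(a2, b2)] = lab        # same label, now attached to the table node

    for n in matched:
        del labels[n]
    labels[new_node] = "TABLE"
    edges.clear()
    edges.update(rewired)
    return True


# Hypothetical host graph: paragraph N1 to the left, line N4 on top.
labels = {"N1": "P", "N4": "L", "v1": "COL", "v2": "COL"}
edges = {
    ("v1", "v2"): "R", ("v2", "v1"): "L",
    ("v1", "N1"): "L", ("N1", "v1"): "R",
    ("v1", "N4"): "T", ("N4", "v1"): "B",
    ("v2", "N4"): "T", ("N4", "v2"): "B",
}
rewritten = merge_columns(labels, edges)
```

After the call, the L edge that ran from v1 to N1 runs from the table node to N1, mirroring the example given for the first OUT relation.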

Most embedding relations can be partitioned into two distinct parts: one that 
establishes the geometric and one that establishes the logical relations between the 
entities. The relations presented for the rewrite rule of Fig. 8 only manipulated the 
geometric edges. An example of the type of the embedding relations that introduce
logical edges into the host graph is shown in Fig. 9. The application of the rule in 
Fig. 9 to the host graph of Fig. 2 introduces two additional edges as shown in Fig. 7. 

In practice, the cost of applying the rule of Fig. 9 to the host graph of Fig. 2
can be reduced by rewriting only the part of the graph that matches the rule in Fig. 4
and then re-labeling node N4 to HEADER and inserting additional edges connecting 
HEADER and TABLE with labels HEAD and SUBJ accordingly. The two rules 
presented in this paper are a representative set. The table recognition system employs 
other rules that are not listed in this paper due to the limited space. 




OUT= {(2,Z,*,l,T,*),(3,T,2,l,T,2),(2,/?,3,l,i?,3),(3,i?,*,l,i?,*),(2Y3,r,4,l,r,4),(2Y3,r,4,l,ri HEAD,0), 
(2Y3,5,1,5,*),(4,5,2Y3,0,5,2Y3),(4,5,2Y3,0,51 
IN= {(2,Z,3,l,T,3),(3,TAl,T,*),(2,/?,^l,/?,*),(3,i?,2,l,/?,2),(2Y3,r,*4,r,*), 
(2Y3,5,4,l,5,4),(4,r,2Y3,0,T,2Y3),(4,r,*,0,r,*)} 



Fig. 9. A rewrite rule for constructing a table from two columns and assigning the label
HEADER to a line entity; two additional logical edges, HEAD and SUBJ, are also introduced



2.4.3 Application Conditions 

The application conditions associated with each rewrite rule specify when that rule is 
applicable. These conditions are typically expressed as constraints or predicates on 
the node and edge attributes and are typically derived based on a set of common 
document formatting conventions. The conditions on a rule are checked before the 
graph is to be rewritten. In what follows, we describe some of the application 
conditions used in our current system. 

One condition is tested prior to the application of every rule $g_l \to g_r$:
it maintains the topological integrity of the document graph stated earlier. This
condition states that the bounding box of the replacement graph should not overlap with the
bounding boxes of any of the nodes in the remainder graph. For example, this constraint denies the
application of a rule that attempts to rewrite the subgraph containing the nodes $v_1$
and N4 in Fig. 2, since the bounding box of the resulting graph would intersect an
existing node, i.e. node $v_2$, in the host graph. This condition was designed to avoid the
case of overlapping entities in the initial design of the system. It is
independent of the node labels and depends only on the geometric attributes of the
nodes.
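
This topological condition reduces to a bounding-box test, sketched below. The function names, the box format (x0, y0, x1, y1) and the sample coordinates are assumptions made for illustration; the layout mimics the situation described for Fig. 2, with a line N4 above two columns.

```python
def union_box(boxes):
    """Smallest box (x0, y0, x1, y1) enclosing all the given boxes."""
    x0s, y0s, x1s, y1s = zip(*boxes)
    return (min(x0s), min(y0s), max(x1s), max(y1s))


def boxes_overlap(a, b):
    """True if the interiors of boxes a and b intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def preserves_topology(boxes, matched):
    """Application condition: the merged box must miss every remaining node."""
    merged = union_box([boxes[n] for n in matched])
    return not any(
        boxes_overlap(merged, box)
        for node, box in boxes.items()
        if node not in matched
    )


boxes = {
    "N4": (0, 0, 100, 10),      # line spanning the page width
    "v1": (0, 20, 45, 100),     # left column
    "v2": (55, 20, 100, 100),   # right column
}
columns_ok = preserves_topology(boxes, {"v1", "v2"})
header_first = preserves_topology(boxes, {"v1", "N4"})
```

Merging the two columns passes the test, whereas merging v1 with N4 is rejected because the enclosing box would cover v2.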

Another application condition that imposes a structural constraint on graph 
rewriting is alignment. This constraint checks whether the text lines in a horizontal 
merge or column structures in a vertical merge are aligned. This is accomplished via 
a single abstraction that measures the extent of alignments between two sequences of 
line segments or intervals. The instantiation of this constraint for a horizontal merge 
deals with the sequence of line-height intervals for the text lines in each node and for 
a vertical merge deals with the sequence of column-width intervals for the columns 
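in each node. Consider two interval sequences $I_1$ and $I_2$. Let each sequence $I_j$ be
defined as:

$$I_j = \{\, I_{jk} = [a_{jk}, b_{jk}] : a_{jk} < b_{jk} < a_{j,k+1},\ k = 0, \ldots, N_j \,\}, \quad j = 1, 2$$

Then, the alignment metric can be expressed in the following manner:

$$\mathrm{alignment\_cost}(I_1, I_2) = \sum_{m=0}^{N_1} \sum_{n=0}^{N_2} c(I_{1m}, I_{2n})$$

[Case definition of the per-pair term $c(I_{1m}, I_{2n})$, given in terms of the containment ($I_{1m} \subset I_{2n}$) and overlap ($I_{1m} \cap I_{2n} \neq \emptyset$) of the two intervals]

Fig. 10. A metric for measuring the alignment of two sequences of interval segments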



The symbols $\cap$ and $\subset$ represent the interval operators of overlap and
containment respectively. These operators are designed to be statistically robust
against document anomalies such as page skew or noise. For example, interval 
overlaps of a few pixels are disregarded. The containment operator has a higher 
precedence than the overlap operator in the above formulation. Application 
conditions can also be designed to check the validity of certain logical combinations. 
For example, a simple predicate has been constructed that prohibits the merging of 
graph nodes if any of them is a paragraph. This reaffirms the perception that a 
paragraph is an atomic logical document entity that does not combine with other 
document entities. Simple predicates are also used to identify such entities as figures 
or halftones. The table recognition system employs other application conditions that 
are not listed here. 
2.4.4 Attribute transfer functions

The attribute transfer functions associated with each rule compute the
attributes of the replacement graph based on the attributes of the replaced graph
and the host graph $G$. There are two distinct mechanisms for transferring attributes:
inheritance and synthesis. Examples of inherited attributes include the font face of the
characters and the actual text content of the lines. These attributes can be transferred
directly to the word entity made from those characters or the paragraph entity made
from those lines. The transfer mechanism in this case is an identity mapping. The
synthesized attributes, on the other hand, are constructed using functional mappings. 
The bounding box of an entity is an example of the type of an attribute that is 
synthesized. The synthesis of other attribute types requires more sophisticated 
procedures. For example, merging of two interval sets is required in order to 
construct the column structure of a TABLE entity. For the rewrite rule shown in Fig. 
8, this reduces to adjoining two line intervals. A more general procedure is used to 
construct the resulting column structure when two tables are combined vertically. 
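
The two transfer mechanisms can be sketched as follows. This is an illustration under stated assumptions, not the system's implementation: the `Entity` record, its field names, and the use of a plain interval union for the column structure are hypothetical; the source only says that inherited attributes are copied and that bounding boxes and column structures are synthesized.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Interval = Tuple[int, int]
BBox = Tuple[int, int, int, int]  # (left, top, right, bottom)


@dataclass
class Entity:  # hypothetical record for a node of the layout graph
    label: str
    bbox: BBox
    font: str = ""
    text: List[str] = field(default_factory=list)
    columns: List[Interval] = field(default_factory=list)


def union_bbox(a: BBox, b: BBox) -> BBox:
    """Synthesized attribute: smallest box enclosing both boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_intervals(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Synthesized attribute: union of two interval sets, fusing overlaps."""
    merged: List[Interval] = []
    for start, end in sorted(a + b):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def transfer_attributes(parts: List[Entity], label: str) -> Entity:
    """Build the attributes of the merged entity from its parts."""
    bbox = parts[0].bbox
    for p in parts[1:]:
        bbox = union_bbox(bbox, p.bbox)
    columns: List[Interval] = []
    for p in parts:
        columns = merge_intervals(columns, p.columns)
    return Entity(
        label=label,
        bbox=bbox,                                      # synthesized
        font=parts[0].font,                             # inherited (identity)
        text=[line for p in parts for line in p.text],  # inherited (identity)
        columns=columns,                                # synthesized
    )
```

The font and text are passed through unchanged, while the bounding box and the column intervals are recomputed from the parts.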

2.4.5 Control strategy 

As discussed in Section 1.2, there are two major strategies for controlling the graph
rewriting process: application conditions and ordering of the productions. The system
proposed here utilizes both capabilities. The following three considerations were taken
into account in designing the control strategy. First, the order of the productions is
horizontally biased. In other words, if a horizontal rule is applicable, it will be used
prior to any vertical rule. This bias was designed based on the empirical knowledge
that most tables obey the horizontal reading order. Second, if there are any
application conditions that are common to a set of productions, they are tested prior
to the application of the rule. Third, whenever possible, parsing is continued at the
point of last rewrite in order to minimize the complexity of search associated with
subgraph isomorphism. This is justified since we seek to rewrite the graph, not to
reduce it to a single start symbol. If parsing cannot be continued at the point of last
rewrite, a search is made to locate the next eligible node. To comply with the first
consideration, the productions are partitioned into four categories, each
corresponding to one of the L, R, T or B directions. It is often the case that the same
production is in multiple partitions.

The control is carried through a recursive function that terminates when there are
no more rewrites possible. The engine that powers the control mechanism is the
rewrite() function. Fig. 11 shows the recursion core of the rewrite engine.

rewrite(G, v, production_set)

    /* apply application conditions common to all productions */
    if paragraph(v), return FALSE;

    /* test each production */
    for each production p: g_l -> g_r in production_set
        g_r <- instantiate(right_hand(p));

        /* apply production specific application conditions */
        for each v' in g_l: if vertical-aligned(v, v'), continue;

        /* if all conditions are satisfied, rewrite the graph */
        detach(g_l, G);
        embed(g_r, G);
        transfer_attributes(g_r, g_l, G);
        return (g_r);

Fig. 11. The recursion core of the rewrite engine 

This function first tests all the common application conditions. These are typically
conditions that preserve the structural integrity of the graph and are common to an
entire set of productions. It then goes through an ordered set of productions and
rewrites the graph based on the first production that is applicable. In the case of a
context-free grammar, the function returns the rewritten node as the point of last
rewrite. If $g_r$ is itself a graph, a node of the rewritten graph is returned.

The rewrite engine starts by testing all the application conditions that are
common to all productions. For example, if the node at the point of rewrite is a
paragraph, the rewrite is aborted. The engine then goes through the ordered set of
productions and checks all the production specific conditions. For example, if it is
required to check for horizontal or vertical alignment between any two nodes, the
test is performed. Finally, if the conditions are satisfied, $g_l$ is detached from
graph $G$, $g_r$ is embedded in $G$, and the new attributes are transferred. For
example, the bounding box attribute is calculated at this point.
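
The control flow described above can be sketched in a few lines. This is a simplified illustration, not the system's engine: the `Graph` and `Production` classes, the `parse` driver, and the `match_table_column` rule are hypothetical, the embedding simply reconnects the new node to all former outside neighbors, and attribute transfer is omitted.

```python
from typing import Callable, Dict, List, Optional, Set


class Graph:
    """Tiny host graph: node id -> label, plus undirected adjacency."""

    def __init__(self) -> None:
        self.label: Dict[int, str] = {}
        self.adj: Dict[int, Set[int]] = {}
        self._next_id = 0

    def add(self, label: str) -> int:
        v = self._next_id
        self._next_id += 1
        self.label[v] = label
        self.adj[v] = set()
        return v

    def connect(self, a: int, b: int) -> None:
        self.adj[a].add(b)
        self.adj[b].add(a)


class Production:
    """match(G, v) returns the nodes of the left-hand side at v, or None."""

    def __init__(self, match: Callable, condition: Callable, rhs_label: str):
        self.match = match
        self.condition = condition
        self.rhs_label = rhs_label


def rewrite(G: Graph, v: int, productions: List[Production]) -> Optional[int]:
    if G.label[v] == "PARAGRAPH":          # common application condition
        return None
    for p in productions:                  # ordered: first applicable rule wins
        g_l = p.match(G, v)
        if g_l is None or not p.condition(G, g_l):
            continue                       # production specific condition failed
        outside = set().union(*(G.adj[u] for u in g_l)) - set(g_l)
        for u in g_l:                      # detach the left-hand side
            for w in G.adj.pop(u):
                if w in G.adj:
                    G.adj[w].discard(u)
            del G.label[u]
        new = G.add(p.rhs_label)           # embed the right-hand side
        for w in outside:
            G.connect(new, w)
        return new                         # point of last rewrite
    return None


def parse(G: Graph, productions: List[Production]) -> None:
    """Continue at the point of last rewrite; otherwise search for a node."""
    last = None
    while True:
        candidates = ([last] if last in G.label else []) + sorted(G.label)
        for v in candidates:
            new = rewrite(G, v, productions)
            if new is not None:
                last = new
                break
        else:
            return                         # no rewrite possible anywhere


def match_table_column(G: Graph, v: int) -> Optional[List[int]]:
    """Hypothetical rule: a COLUMN or TABLE next to a COLUMN forms a TABLE."""
    if G.label[v] not in ("COLUMN", "TABLE"):
        return None
    for w in sorted(G.adj[v]):
        if G.label[w] == "COLUMN":
            return [v, w]
    return None
```

Applied to a chain of three COLUMN nodes followed by a PARAGRAPH node, the driver merges the columns into a single TABLE node and leaves the paragraph untouched, because the common application condition rejects it.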



3 Experimental Results 

Rigorous testing has been performed and the system performance has been 
acceptable in these tests. The system is, however, limited in its functionality for two 
reasons. First, it does not account for overlapping entities. To allow for overlapping 
entities, the production set and embedding functions need to be modified. Second, it 
does not find the best possible answer in the case of an ambiguity. This is due to the 
fact that the order of application of the productions is fixed and no mechanism is 
currently implemented for backtracking. Figs. 15 and 16 show applications of the 
two rules presented earlier to two sample document images recognized by our 
system. The shaded areas represent the column structures of the recognized tables. 

4 Conclusions

In this paper, we have presented a system for recognizing tables in documents. It 
starts by building a layout graph from the segmented blocks of a document. The 
graph captures the geometric relations between the blocks. It then proceeds to tag the 
individual blocks with a set of primitive labels. Finally, it uses a set of rules to
rewrite the layout graph. The output of this process is another graph whose entities 
are constructed by merging the primitive entities in the layout graph. The rewrite 
rules for this process are designed based on document knowledge and general 
formatting conventions. The system is applicable to a PDF document as well as a 
raster document with proper preprocessing. The system works primarily by using 
geometric cues. The text content of the document is not used. 

Appendix A



Appendix A illustrates additional rewrite rules that have been implemented for the 
table recognition system discussed in this paper. 




[Figure: rule graphs with their IN and OUT embedding sets]

Fig. 12. A rewrite rule for constructing a table with n+1 columns by merging a table with n
columns with a column



[Figure: rule graphs with their IN and OUT embedding sets]

Fig. 13. A rewrite rule for constructing a table with m+1 rows by merging a row with a table
that has m rows




[Figure: rule graphs with their IN and OUT embedding sets]

Fig. 14. A rewrite rule for constructing a table by merging two tables

Abstract. The qualitative structure of images is much like the qualitative
structure of landscapes. 'Critical points' of a landscape are the
summits, the immits, and the saddle points. These points are connected 
through special curves on the surface of the landscape. The new approach 
computes this basic qualitative structure of an image or a landscape 
from the neighborhood structure of a sampled grid by a process called 
monotonic dual graph contraction (MDGC). The vertices of the graphs 
store information about gray level or height as attributes. Edges repre- 
sent surface curves connecting the vertices. MDGC successively removes 
non-extrema from the original graphs while it preserves the connectivity 
between extrema and the connectivity level, a new property expressing 
the least height difference when moving from one extremum to another 
extremum. Since the graph represents a surface it is planar and the dual 
graph is well defined. MDGC performs simplifications such that in one 
graph all local maxima survive and in the dual all local minima survive. 
Hence we call them ’maximum graph’ and ’minimum graph’ respectively. 
The focus in this paper is on the description of the neighborhood and 
the hierarchy of the local extrema of height. Monotonic properties of the 
gray level image are preserved during the contraction process. The im- 
plementation of the approach is described and experimental results are 
discussed. 



1 Introduction 

In this paper, the structure of images obtained from the monotonic contraction of a
pair of dual graphs is described. This method provides an interpretation of properties
like neighborhoods and hierarchies of features. As an application, the sampling grid
of the pixels in a two-dimensional digital gray level image is replaced by a pair
of dual graphs adapted to the image's critical points. If the gray levels are
interpreted as heights, the image can be regarded as a digital terrain model (DTM).

The authors gratefully acknowledge the assistance of Roland Glantz in the
preparation of this paper. This work is supported by the Austrian Science Foundation
(FWF) under grant S7002-MAT.

M. Nagl, A. Schürr, and M. Münch (Eds.): AGTIVE'99, LNCS 1779, pp. 297-308, 2000.

Koenderink [Koe84,Kv97] defined the qualitative structure of a digital terrain in
terms of: summits as the local maxima of height, immits as the local minima of 
height, topological curves as lines which connect summits or immits with each 
other, and saddle points on topological curves. 

In our approach the structure is computed in two steps: First, the image is 
transformed into an attributed graph, where the vertices represent pixels, the 
edges represent neighborhoods of pixels, and the vertex and edge attributes are 
gray levels. In the main step, this graph is contracted until it consists of (a) 
vertices which represent summits and faces which represent immits; (b) these 
extrema are connected by curves on the surface passing through a saddle. 

The proposed approach has several merits: for reasons of speed, the contraction
is performed in parallel in both the graph and the corresponding dual graph.
Furthermore, this dual graph contraction is based on a theory with well-known
properties [Kro95]. As a novelty, the dual graph contraction performed in this
paper preserves monotonic properties like height differences of critical points
and, additionally, results in a compact representation of the structure.

The structuring of gray level images can also be achieved by other approaches,
among which watershed transformations are central to the efficient ones [MR98].
The monotonic dual graph contraction (MDGC) presented in this paper differs from
them in several points:

1. MDGC computes a dual pair of contracted graphs which describe the neigh- 
borhood and the hierarchy of the summits and immits. 

2. Watersheds are represented by a set of pixels. MDGC computes a compact 
representation of a DTM, where the summits and immits are represented by 
vertices and the topological curves are represented by paths. 

3. Watershed transformations have to take into account the plateaus (more
precisely, the behavior of water flow in the interior of a plateau), and this
requirement must be fulfilled by additional effort. This need not be done
in MDGC.

4. Owing to the attributes, explicit height information about the summits and
immits is provided within MDGC.

The remainder of the paper is organized as follows: In Section 2 the basic 
concepts of MDGC are defined in detail. The algorithm MDGC and the proper- 
ties are discussed, too. The implementation of MDGC is described in Section 3. 
Afterwards, experimental results are presented in Section 4. We conclude in 
Section 5 with an outlook for future work. 

2 Monotonic Dual Graph Contraction 

Our application, the computation of the image structure, relies on the contraction
of a pair of dual graphs $(G, \bar{G})$. In the following, the pair of dual graphs and
the dual graph contraction are defined. Afterwards, the monotony preserving
property is stated and its correctness is proved. The basic concepts from the
field of graph theory are adopted from [TS92].

Fig. 1. (a) The gray levels of the pixels. (b) The maximum graph of the marked
sub-image (inside the black border), where the vertices are represented as circles.
The numbers in the vertices and at the edges indicate the vertex values and the 
edge values of the maximum graph. The numbers in the middle of the square 
regions are the attributes of the vertices of the minimum graph. 



The array of image pixels (Fig. 1(a)) is represented by an attributed graph,
where the vertices and the edges contain additional information [ECS98]. Each
pixel of the image is represented by a vertex of the graph. A vertex is adorned
with an attribute $\zeta(\cdot)$, which in our application is the gray level of the
corresponding pixel (short: vertex value). Vertices are connected by an edge if their
corresponding pixels are neighbors¹. Analogously, the edges have attributes $\xi(\cdot)$
(short: edge values). Their definition is motivated by the problem of how to get,
e.g., from one summit to another summit on a topological curve without
descending into deep valleys. We are looking for a path whose smallest edge value
is maximal with respect to the smallest edge values along the alternative
paths [GEK99b]. This edge value is called the max-connectivity level. We first
formally introduce the dual maximum and minimum graphs, and then define the
corresponding connectivity level for each.

Definition 1 (Maximum Graph, Minimum Graph). A MAXIMUM GRAPH
$G = (V, \zeta, E, \xi)$ consists of a vertex set $V$, a mapping $\zeta$ for the vertex
values, an edge set $E$, and a mapping $\xi$ for the edge values, where the following
constraint on the attributes is satisfied:

$$\forall e = (x, y) \in E: \quad \min\{\zeta(x), \zeta(y)\} \ge \xi(e)$$

The MINIMUM GRAPH $\bar{G} = (\bar{V}, \bar{\zeta}, \bar{E}, \bar{\xi})$ is the dual graph of $G$,
which consists of a vertex set $\bar{V}$, a mapping $\bar{\zeta}$ for the vertex values, an
edge set $\bar{E}$, and a mapping $\bar{\xi}$ for the edge values, where the following
constraint on the attributes is satisfied:

$$\forall \bar{e} = (\bar{x}, \bar{y}) \in \bar{E}: \quad \max\{\bar{\zeta}(\bar{x}), \bar{\zeta}(\bar{y})\} \le \bar{\xi}(\bar{e})$$

The background vertex of $\bar{V}$ is denoted as $\bar{v}_\infty$.

¹ The 4-neighborhood is used to make the graph planar.


The duality in the above definition holds for the structure of the graphs but not
for the attribute values. In our application, an image with pixels $P$ and integer
gray levels $L$ is given (Fig. 1): The maximum graph $G = (V, \zeta, E, \xi)$ consists
of a vertex set $V$ (bijectively mapped to $P$), a mapping $\zeta : V \to L$, an edge
set $E$ (vertices are connected if their corresponding pixels are 4-neighbors), and
a mapping $\xi : E \to L$. Initially, $\xi(e) = \xi(v, w) = \min\{\zeta(v), \zeta(w)\}$ is
chosen. The minimum graph $\bar{G} = (\bar{V}, \bar{\zeta}, \bar{E}, \bar{\xi})$ is the dual graph
of $G$, which consists of

- a vertex set $\bar{V}$ (bijectively mapped to the faces of $G$),

- a mapping $\bar{\zeta}$ of the vertices to the smallest edge value of the edges
surrounding the face,

- an edge set $\bar{E}$ (dual vertices are connected if their faces share a common
boundary segment),

- and a mapping $\bar{\xi} : \bar{E} \to L$ with $\bar{\xi}(\bar{e}) = \xi(e)$ for all dual
edges $\bar{e} \in \bar{E}$.

For an illustration of dual edges see Fig. 3. Summarizing, a vertex value of the
maximum graph stores the maximum value of its receptive field, and a vertex
value of the minimum graph stores the minimum value of its receptive field.
Note that the concepts of maximum and minimum graphs enable one to encode
any features and not only gray levels.
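
The initialization described above can be sketched as follows. The sketch assumes an image given as a list of rows, pixels addressed as `(row, column)`, inner faces identified by their top-left pixel, and the background face left out; the function name `build_graphs` and these conventions are illustrative assumptions rather than part of the original implementation.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]
Edge = Tuple[Pixel, Pixel]


def build_graphs(img: List[List[int]]):
    """Return vertex values, edge values, and inner-face values of the dual."""
    rows, cols = len(img), len(img[0])
    # Vertex values of the maximum graph: the gray levels.
    zeta: Dict[Pixel, int] = {
        (r, c): img[r][c] for r in range(rows) for c in range(cols)
    }
    # Edge values: minimum of the two end vertices (4-neighborhood).
    xi: Dict[Edge, int] = {}
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    xi[((r, c), (rr, cc))] = min(zeta[(r, c)], zeta[(rr, cc)])
    # Vertex values of the minimum graph: smallest edge value around each face.
    zeta_bar: Dict[Pixel, int] = {}
    for r in range(rows - 1):
        for c in range(cols - 1):
            boundary = [
                ((r, c), (r, c + 1)),          # top
                ((r + 1, c), (r + 1, c + 1)),  # bottom
                ((r, c), (r + 1, c)),          # left
                ((r, c + 1), (r + 1, c + 1)),  # right
            ]
            zeta_bar[(r, c)] = min(xi[e] for e in boundary)
    return zeta, xi, zeta_bar
```

Because each dual edge inherits the value of the edge it crosses, a face value computed this way can never exceed the values of its bounding edges, which is the constraint required of a minimum graph.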

Definition 2 (Max-Connectivity Level). Given a maximum graph
$G = (V, \zeta, E, \xi)$. Let $C(v, w)$ be the set of all paths between a pair of distinct
vertices $(v, w)$, $v \in V$ and $w \in V$. The MAX-CONNECTIVITY LEVEL
$\mathrm{maxCL}(v, w)$ is the highest point to which one has to descend when moving
from $v$ to $w$:

$$\mathrm{maxCL}(v, w) = \max\{\, \min\{\xi(e) \mid e \in c\} \mid c \in C(v, w) \,\}.$$

Analogously, a min-connectivity level is defined for minimum graphs.

Definition 3 (Min-Connectivity Level). Given a minimum graph
$\bar{G} = (\bar{V}, \bar{\zeta}, \bar{E}, \bar{\xi})$. Let $C(\bar{v}, \bar{w})$ be the set of all paths
between a pair of distinct vertices $(\bar{v}, \bar{w})$, $\bar{v} \in \bar{V}$ and
$\bar{w} \in \bar{V}$. The MIN-CONNECTIVITY LEVEL $\mathrm{minCL}(\bar{v}, \bar{w})$ of the pair
of distinct vertices $(\bar{v}, \bar{w})$ is the lowest height which has to be climbed
when moving from $\bar{v}$ to $\bar{w}$:

$$\mathrm{minCL}(\bar{v}, \bar{w}) = \min\{\, \max\{\bar{\xi}(\bar{e}) \mid \bar{e} \in c\} \mid c \in C(\bar{v}, \bar{w}) \,\}.$$
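
Both connectivity levels are bottleneck path values and can be computed without enumerating paths. The following sketch uses a union-find structure and processes the edges in sorted order, which is a standard technique chosen here for illustration and not taken from the paper; the function names and the edge-list format `(x, y, value)` are assumptions.

```python
from typing import Hashable, Iterable, Optional, Tuple

WeightedEdge = Tuple[Hashable, Hashable, float]


def _bottleneck(edges: Iterable[WeightedEdge], v, w, descending: bool) -> Optional[float]:
    """Insert edges in sorted order until v and w fall into one component."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for x, y, value in sorted(edges, key=lambda e: e[2], reverse=descending):
        parent[find(x)] = find(y)
        if find(v) == find(w):
            return value   # value of the edge that first connects v and w
    return None            # v and w are not connected


def max_connectivity_level(edges, v, w):
    """Largest, over all paths from v to w, of the smallest edge value."""
    return _bottleneck(edges, v, w, descending=True)


def min_connectivity_level(edges, v, w):
    """Smallest, over all paths from v to w, of the largest edge value."""
    return _bottleneck(edges, v, w, descending=False)
```

For a four-cycle a-b-c-d with edge values 5 (a-b), 2 (b-c), 4 (a-d) and 3 (d-c), the route from a to c via d descends only to 3, while the route via b descends to 2, so the max-connectivity level of (a, c) is 3.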

In the following, the basic operations for the contraction of graphs are defined
in three steps: first, the dual graph contraction for planar graphs (Section 2.1);
then, the monotonic contraction operations for edges and vertices of maximum
and minimum graphs (Section 2.2); third, the approach MDGC for the
monotonic dual graph contraction (Section 2.3).

2.1 Dual Graph Contraction 

The operation of dual graph contraction is defined for an embedded, planar
graph $G$ and the dual graph $\bar{G}$ of $G$. It is controlled by the following
decimation parameters [Kro95]:

Definition 4 (Decimation Parameter, Contraction Kernel). Given a
graph $G$. A subgraph $D$ of $G$ is a DECIMATION PARAMETER of $G$ if and only
if $D$ is a spanning forest of $G$. The connected components (trees) of $D$ are called
CONTRACTION KERNELS. The roots of the trees are called SURVIVING vertices.
All other nodes of the trees are called NON-SURVIVING vertices. If $G$ has a
background vertex $v_\infty$, then $v_\infty$ must survive.

In the subsequent operation every contraction kernel of D shrinks to a single 
vertex, the root, within G (Figures 2(a) and 2(b)), while all other connections 
are preserved [Kro95] . Notice that the contraction of an edge requires the deletion 
of its dual edge. 

Definition 5 (Dual Graph Contraction). Given an embedded pair of dual
graphs $(G, \bar{G})$ and decimation parameters $D_e$ for the contraction of edges in $G$
and $D_f$ for the contraction of faces in $G$. DUAL GRAPH CONTRACTION (DGC)
consists of two phases:

1. DUAL EDGE CONTRACTION described by a function
$C_e : (G, \bar{G}) \mapsto C_e[(G, \bar{G}), D_e] = (G', \bar{G}')$, and

2. DUAL FACE CONTRACTION $C_f : (G', \bar{G}') \mapsto C_f[(G', \bar{G}'), D_f] = (G'', \bar{G}'')$.

Figures 2(c) and 2(d) demonstrate an example of the DGC: The contraction
kernels for the first phase and the results are shown. During the contraction, a
non-surviving vertex is identified with a surviving vertex which is the root of
the contraction kernel. Note that the dual edge contraction deletes dual edges in $G$
and $\bar{G}$ (Fig. 2(b) and 3). Afterwards, the second phase can be executed in order
to remove degenerated faces. Degenerated faces are, e.g., cycles of length less
than three. DGC has been shown to preserve the connectivity, the structure and
the planarity of the graphs [Kro95]. Applications for the DGC are described
in [GEK99a].

In the following we define the generation of decimation parameters for a 
maximum graph and for the corresponding minimum graph. Decimation pa- 
rameters are based on the decision whether an edge of a maximum graph is 
max-contractible or not. An edge of a minimum graph must be min-contractible 
for the monotonic dual graph contraction. 

Fig. 2. Part (a) shows an embedded graph and its dual graph (dual edges drawn
as '···'), where the vertex representing the background region and all its incident
edges are omitted for the sake of simplicity. The contraction kernels are marked
('o' are non-surviving vertices; arrows point at the surviving vertices). Part (b)
shows the result of the contraction. Parts (c) and (d) depict a contraction of the
dual graph ('□' are non-surviving vertices; arrows point at the surviving vertices).

2.2 Monotonic Contraction Operations for Maximum and Minimum Graphs

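Definition 6 (Max-contractible, Min-contractible). Given a maximum
graph $G = (V, \zeta, E, \xi)$ and a minimum graph $\bar{G} = (\bar{V}, \bar{\zeta}, \bar{E}, \bar{\xi})$.
Let $e = (v, w) \in E$ and $\bar{e} = (\bar{v}, \bar{w}) \in \bar{E}$ be edges, $v$ and $\bar{v}$ being
non-surviving vertices, $\bar{v}$ not being the background face, and $w$ and $\bar{w}$ being
surviving vertices.

The edge $e$ is MAX-CONTRACTIBLE if and only if

$$\zeta(v) \le \xi(e) \le \zeta(w).$$

The edge $\bar{e}$ is MIN-CONTRACTIBLE if and only if

$$\bar{\zeta}(\bar{v}) \ge \bar{\xi}(\bar{e}) \ge \bar{\zeta}(\bar{w}).$$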
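
As a small worked illustration, the two conditions can be written as predicates. The sketch reads the inequalities of Definition 6 as non-strict, which is an assumption forced by Definition 1: since an edge value of a maximum graph never exceeds the values of its end vertices, an edge can only be max-contractible when its value equals the value of the non-surviving vertex. The function and parameter names are hypothetical.

```python
def max_contractible(zeta_v: float, xi_e: float, zeta_w: float) -> bool:
    """Edge (v, w) of a maximum graph; v non-surviving, w surviving."""
    return zeta_v <= xi_e <= zeta_w


def min_contractible(zeta_bar_v: float, xi_bar_e: float, zeta_bar_w: float) -> bool:
    """Edge (v, w) of a minimum graph; v non-surviving, w surviving."""
    return zeta_bar_v >= xi_bar_e >= zeta_bar_w


# With the initial edge value min(3, 5) = 3, the lower vertex may be
# contracted into the higher one, but not the other way round.
print(max_contractible(3, 3, 5), max_contractible(5, 3, 3))
```

In other words, a contraction in the maximum graph always moves uphill (or along a plateau), so local maxima are never removed; the min-contractible condition mirrors this for local minima in the dual.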

As the final step of the contraction of an edge within a maximum graph or a
minimum graph, the edge values of the edges incident to the surviving vertex are
updated as follows:

Definition 7 (Max-dual Contraction, Min-dual Contraction). Given a
maximum graph $G = (V, \zeta, E, \xi)$. A MAX-DUAL CONTRACTION of a
max-contractible edge $e = (v, w) \in E$ with surviving vertex $w$ and non-surviving
vertex $v$ is a contraction of $e$, i.e. any edge $e' = (x, v)$, $e' \ne e$, becomes a new edge
$(x, w)$, and the attributes of the surviving elements remain unchanged. Analogously
is defined: Given a minimum graph $\bar{G} = (\bar{V}, \bar{\zeta}, \bar{E}, \bar{\xi})$. A MIN-DUAL
CONTRACTION of a min-contractible edge $\bar{e} = (\bar{v}, \bar{w}) \in \bar{E}$ with surviving
vertex $\bar{w}$ and non-surviving vertex $\bar{v}$ not being the background face is a
contraction of $\bar{e}$, i.e. any edge $\bar{e}' = (\bar{x}, \bar{v})$, $\bar{e}' \ne \bar{e}$, becomes a new
edge $(\bar{x}, \bar{w})$, and the attributes of the surviving elements remain unchanged.

2.3 Monotonic Dual Graph Contraction 

All the local operations defined above consider edges and vertices of maximum
and minimum graphs. Finally, the approach MDGC can be formalized as follows:

Definition 8 (Monotonic Dual Graph Contraction). Given a maximum
graph $G = (V, \zeta, E, \xi)$ and the corresponding minimum graph
$\bar{G} = (\bar{V}, \bar{\zeta}, \bar{E}, \bar{\xi})$. The MONOTONIC DUAL GRAPH CONTRACTION
consists of two phases:

1. Max-dual contraction of $(G, \bar{G})$ with max-contractible edges of the
maximum graph $G$ as selected decimation parameters, and

2. Min-dual contraction of $(G, \bar{G})$ with min-contractible edges of the
minimum graph $\bar{G}$ as selected decimation parameters.

The following property ensures the correctness of MDGC: 

Fig. 3. Min-dual contraction of $\bar{e}_\alpha = (\bar{v}, \bar{w})$: the bold path
$B(\bar{v}) \setminus \{e_\alpha\} \in C(v, w)$ preserves the max-connectivity level $\mathrm{maxCL}(v, w)$.



Proposition 1. The min-dual contraction preserves the max-connectivity levels.

Proof: Given a maximum graph $G = (V, \zeta, E, \xi)$ and a corresponding minimum
graph $\bar{G} = (\bar{V}, \bar{\zeta}, \bar{E}, \bar{\xi})$. Note that $\bar{G}$ is the dual graph to $G$. We show
that the min-dual contraction of an edge $\bar{e}_\alpha = (\bar{v}, \bar{w}) \in \bar{E}$ in $\bar{G}$
preserves the max-connectivity levels $\mathrm{maxCL}(v, w)$ in $G$ (see Fig. 3): The
contraction of $\bar{e}_\alpha$ in $\bar{G}$ goes along with the deletion of its dual edge
$e_\alpha = (v, w)$ from $G$. It suffices to prove that the max-connectivity level
$\mathrm{maxCL}(v, w)$ is not decreased if edge $e_\alpha$ is removed. In other words, we have to
show that $\mathrm{maxCL}(v, w) \ge \xi(e_\alpha)$ for the edge $e_\alpha$, which is also a (short) path
between $v$ and $w$. Since the face $\bar{v}$ is not the background face, it is surrounded by
a closed path $B(\bar{v})$ ('boundary') containing edge $e_\alpha$. Let us consider the edge
values of the alternative path $B(\bar{v}) \setminus \{e_\alpha\}$, which is also a path from $v$ to
$w$. In Fig. 3 this path is depicted bold.

Initially, the edge values $\xi(e_b) = \bar{\xi}(\bar{e}_b)$ around a face are never smaller than
the face value (cf. the initialization of the maximum and minimum graphs and
Fig. 1). This property is not destroyed by min-dual contraction, since a
min-contractible edge is always contracted into the face with the smaller value. It
is also not destroyed by max-dual contraction, since the edge values may only
increase during update.

Since MDGC preserves the property that faces cannot receive attributes
higher than their bounding edges, we have $\xi(e_b) \ge \bar{\zeta}(\bar{v})$ for all edges
$e_b \in B(\bar{v}) \setminus \{e_\alpha\}$, and also $\mathrm{maxCL}(v, w) \ge \bar{\zeta}(\bar{v})$. Furthermore,
$\bar{e}_\alpha$ must be min-contractible, and consequently, $\bar{\zeta}(\bar{v})$ is also an upper
bound for $\bar{\xi}(\bar{e}_\alpha) = \xi(e_\alpha)$, QED. □

A similar proof yields that the max-dual contraction of an edge in the maxi- 
mum graph preserves the min-connectivity levels in the corresponding minimum 
graph. 

3 Implementation of MDGC

The algorithm of MDGC has a simple structure: as input a gray level image is 
taken and as output a structure consisting of summits and immits is computed. 
Both contractions, the monotonic and the dual monotonic, are applied to the 
maximum graph and the minimum graph until no further contraction is possible. 
Note, both the min-dual contractions and the max-dual contractions can be 
performed in parallel [Kro95]. The method converges in a logarithmic number 
of steps since the length of the paths between extrema shrinks by a factor of at 
least two at every (parallel) step. 

The implementation of MDGC is based on LEDA [MN99, Library of Efficient
Data Structures and Algorithms] and a tool for the dual contraction of
graphs [KBBS98]. In contrast to this tool, MDGC does not execute the contraction
on the graphs themselves; instead, trees which contain the contraction kernels are
constructed as follows: The non-surviving edges together with their corresponding
edges in the dual are marked. Before contraction, each graph vertex points
at a tree consisting of a single tree vertex. The contraction of an edge e is now
expressed by the linking of the two trees belonging to the end vertices of edge e.
The root of the new tree is set to the tree vertex the surviving graph vertex is
pointing at. At each step of the contraction process, the set of surviving graph
vertices comprises all graph vertices that point at a tree root. A surviving edge,
however, is represented by the corresponding bridge, i.e. an edge of the graph
which has not been marked yet. The surviving vertices connected by a surviving
edge are identified via the roots of the trees that the end vertices of the
corresponding bridge are pointing at. The trees are represented by a collection of
trees using the LEDA data structure dynamic_trees, where each operation takes
$O(\log^2 n)$ amortized expected time, n being the number of vertices. Working with
this collection of trees is faster than executing a contraction in a (dual) graph, since
edges and vertices need not be removed from the graph and its dual. Finally, as
graphic output, the trees are drawn representing the contracted graph and its
dual, as demonstrated in the following section.
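
The bookkeeping described above can be sketched with plain parent pointers. This is a simplified stand-in for illustration only: it replaces the LEDA dynamic tree structure by a dictionary of parent links (without the logarithmic time bounds), and the class and method names are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple


class ContractionForest:
    """Each graph vertex points into a tree; tree roots are the survivors."""

    def __init__(self, vertices: Iterable[Hashable]) -> None:
        # Before contraction, every vertex is the root of its own tree.
        self.parent: Dict[Hashable, Hashable] = {v: v for v in vertices}

    def root(self, v: Hashable) -> Hashable:
        while self.parent[v] != v:
            v = self.parent[v]
        return v

    def contract(self, non_surviving: Hashable, surviving: Hashable) -> None:
        """Link the two trees; the surviving vertex's root becomes the root."""
        r_ns, r_s = self.root(non_surviving), self.root(surviving)
        if r_ns != r_s:
            self.parent[r_ns] = r_s

    def survivors(self) -> Set[Hashable]:
        return {v for v, p in self.parent.items() if v == p}

    def surviving_edge(self, bridge: Tuple[Hashable, Hashable]) -> Tuple[Hashable, Hashable]:
        """Map an unmarked graph edge (bridge) to the survivors it connects."""
        x, y = bridge
        return (self.root(x), self.root(y))
```

No vertex or edge is ever deleted from the graph; a contraction only redirects one parent link, and a bridge is resolved to its surviving end vertices on demand.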

4 Experimental Results 

The algorithm MDGC is applied to a test image containing two sole immits
(center and bottom) and a pair of nested immits on the upper left (Fig. 1(a)).
As a result, we expect two sole loops and a pair of nested loops on the upper left
in the contracted final graph (Fig. 4). This result will reflect the neighborhood
and hierarchy of the local extrema of height. A part of the initial maximum
graph is shown in Fig. 1(b).

Fig. 4 shows the computed topological curves when neither the maximum
graph nor the minimum graph is contractible anymore. The line segments
represent the edges of the trees. Here the tree roots are identified with the surviving
vertices.

Fig. 4. Topological curves computed by MDGC. The legend distinguishes surviving
vertices, non-surviving edges pointing at the surviving vertex, and bridges.

Fig. 5. Contracted maximum graph (vertices = 'o', undirected (curved)
edges = '—') and the surviving vertices '□' of the minimum graph (displayed
without edges). The vertex labels refer to Fig. 1(b).

The union of all non-surviving edges of the contraction trees and all bridges
(Fig. 4) describes the watersheds. Fig. 5 depicts the contracted final graph and
the vertices of its dual. Comparing the final maximum graph (Fig. 5) with the
gray level image (Fig. 1(a)), we summarize the following results:

1. Each edge of the final maximum graph represents either a topological curve 
bordering an immit or a topological curve connecting two immits. 

2. The two nested immits close to the upper left corner of the image are repre- 
sented by two nested cycles in the final maximum graph. 

3. The cycles from the previous item are loops, because there exists a single 
saddle point on each of the topological curves bordering the immits. 

4. The fact that the immit on the bottom is almost replenished, is reflected by 
the small differences of the corresponding attributes in the final graph. 

5 Conclusions 

In this paper we have proposed a new approach to the computation of extrema 
within attributed graphs. For the representation the class of minimum and max- 
imum graphs has been defined. The approach MDGC has been applied to the 
structure of gray level images. The structure is represented by a pair of dual 
graphs. It is compact since each vertex of the contracted graphs represents either 
a summit or an immit. The edges of the graphs represent contracted topologi- 
cal curves. The graph contains also the information about the local extrema of 
height on the topological curves. 

Our approach outperforms watershed approaches since the neighborhood and 
the hierarchy of the summits and immits is computed, and additionally, informa- 
tion about topological curves is provided through the attributes of the graphs. 
We believe that MDGC, which has been applied to the segmentation of gray level
images, is a powerful technique and, additionally, that image structuring methods
based on watersheds can profit from it. An important topic of future research
is the use of real data (images).

The current approach does not cope with noise. A single outlier, e.g. a wrong 
local maximum or minimum, may appear in the final representation. Further- 
more, even small differences in the attribute values result in many unnecessary 
and spurious vertices. Hence, a future goal will be to extend the present ap- 
proach with a concept of “importance” for a given vertex (summit or immit) 
which can be related to the relative differences within a local neighborhood. A 
similar criterion has been used in the scale-space approach of Lindeberg [Lin94] . 

A drawback of our concept goes back to the fact, that the saddles in the digi- 
tal elevation model are not represented as vertices. We cannot properly describe 
saddles, which lead to more than two summits, if one follows the ascending crest 
lines (see Fig. 6). Fig. 6 also makes clear, that the final graph is not unique. The 
bent edge might as well be situated at the left side. Our future work will aim at 
the proper representation of the saddles. For this purpose we will have a closer 
look at the contraction kernels. 



308 Roman Englert and Walter Kropatsch 




Fig. 6. The gray levels of the pixels (left) and the final graph representing the 
image (right). 

Planning Geometric Constraint Decomposition
via Optimal Graph Transformations



Christoph M. Hoffmann
Andrew Lomonosov
Meera Sitharam



Abstract. A central issue in dealing with geometric constraint systems
that arise in Computer Aided Design and Assembly is the generation of
an optimal decomposition-recombination plan that is the foundation of
an efficient solution of the constraint system. For the first time, in this
paper, we formalize, motivate and explain the optimal
decomposition-recombination (DR) planning problem as a problem of finding a sequence
of graph transformations $T_i$ that maximizes an objective function subject
to certain criteria. We also give several performance measures, phrased
as graph transformation properties, by which DR-planning algorithms can
be analyzed and compared. Using these performance measures and this
formulation of the problem, we develop a new DR-planner which represents
a significant improvement over existing algorithms.



1 Introduction and Motivation

This paper shows that a core problem in geometric constraint solving is to find
a sequence of graph transformations that satisfies certain formal requirements.
We refer to it as the decomposition-recombination (DR) planning problem, which
is described later.

A geometric constraint problem consists of a finite set of geometric objects
and a finite set of constraints between them. Geometric objects include points,
lines, planes, circles, and so on. Constraints between them include parallel,
perpendicular, distance, tangency, and so on. Some of these constraints are
logical, such as incidence or tangency; others are dimensional, such as distance
or angle. A solution to a geometric constraint problem is a placement of the
geometric objects that satisfies the constraints. For example, solving the
geometric constraint problem shown in the left half of Figure 1 is equivalent to
determining the coordinates of the three points in 2d, given the distances
between them. This reduces to finding real solutions of a system of quadratic
equations. In general, solving a geometric constraint problem reduces to solving
a nonlinear algebraic system over the reals.

Fig. 1. Geometric constraint problem and corresponding constraint graph



Industrial relevance. Geometric constraints are at the heart of computer aided
engineering applications (see, e.g., [11, 12]), and also arise in many geometric
modeling contexts such as virtual reality, robotics, molecular modeling, teaching
geometry, etc. For recent reviews of the extensive literature on geometric
constraint solving see, e.g., [3, 18, 5].

In particular, the constraint decomposition approach to geometric constraint
solving is so successful that most of the major CAD systems, such as I-DEAS or
Pro/ENGINEER, have licensed a commercial solver based on this principle. This
solver uses a repertoire of subgraph patterns and construction rules to decompose 
the constraint graph and break it into small subsets that can be solved easily. It 
is highly successful in planar constraint solving, but barely adequate for spatial 
constraint solving. 

As pointed out in [2], the pattern repertoire quickly becomes unmanageable 
when extending the geometric coverage from a simple subset of planar constraint 
configurations to more complex ones. In spatial constraint solving, using a reper- 
toire of patterns is not very satisfactory, as pointed out in [15], and a more gen- 
eral decomposition strategy is needed. This general strategy, first approached in 
[13], is the subject of this paper. It manages to break the barrier that keeps the 
pattern approach to constraint solving from becoming fully effective in spatial 
constraint solving and in more general planar constraint solving. We would not
be surprised if a commercial implementation were attempted based on this
work. The first and third authors have past and ongoing confidential industrial 
contracts related to this work. 



1.1 Need for decomposition-recombination plans

The major issue in solving geometric constraint problems is efficiency: computing
the solution of the nonlinear algebraic system that arises from geometric
constraints is computationally challenging, and except for very simple geometric
constraint systems such as the example in Figure 1, this problem is not tractable
in practice without further machinery. The overwhelming cost in geometric
constraint solving is directly proportional to the size of the largest subsystem
that is solved using a direct algebraic/numeric solver. This size dictates the
practical utility of the constraint solver, since the time complexity of the
constraint solver is at least exponential in the size of the largest such
subsystem.

Therefore, the geometric constraint solver should use geometric domain knowledge
to develop a Decomposition-Recombination (DR) plan (to be formally defined in
Section 2.2) for decomposing the constraint system into small subsystems, whose
solutions are then recombined. This simplifies the original system, to which the
decomposition-recombination is applied recursively until the system is fully
solved. To facilitate the recombination, the small subsystems in the
decomposition should be geometrically rigid. A rigid or discretely solvable
subsystem of the geometric constraint system is one for which the set of real
zeroes of the corresponding algebraic equations is discrete (i.e., the
corresponding real algebraic variety is zero-dimensional) after the local
coordinate system has been fixed arbitrarily. Discretely solvable systems of
equations are also called wellconstrained or (consistently) overconstrained. A
system of equations that is not rigid is also called underconstrained.

Example. There are many ways to fix a local coordinate system in Figure 1.
One way, for example, is to place a point, say P1, at the origin and to place the
point P3 so that the line segment (P1, P3) coincides with the x-axis. After the
local coordinate system is fixed, the corresponding algebraic equations have
either 2 real solutions (placing P2 above or below the x-axis) or no real
solutions (for example if A + B < C).

Note. It is important to distinguish “discretely solvable” from “has a real solu- 
tion.” Although overconstrained (or even certain wellconstrained) systems may 
have no real solutions at all, by our definition, since their set of real zeroes is 
discrete, they would still be considered “discretely solvable.” 
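
The coordinate-fixing example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the parameter names d13, d12, d23 (the prescribed distances |P1P3|, |P1P2|, |P2P3|) are hypothetical and do not correspond to the labels used in Figure 1.

```python
import math

def place_third_point(d13, d12, d23):
    """Place P1 at the origin and P3 on the positive x-axis, then solve for P2.

    d13, d12, d23 are the prescribed distances |P1P3|, |P1P2|, |P2P3|
    (hypothetical parameter names; d13 is assumed to be positive).
    Returns the list of real positions of P2.
    """
    # P2 = (x, y) must satisfy x^2 + y^2 = d12^2 and (x - d13)^2 + y^2 = d23^2.
    # Subtracting the two quadratic equations leaves a linear equation in x.
    x = (d12**2 - d23**2 + d13**2) / (2.0 * d13)
    y_squared = d12**2 - x**2
    if y_squared < 0:
        return []                      # the two circles do not meet: no real solution
    y = math.sqrt(y_squared)
    if y == 0:
        return [(x, 0.0)]              # degenerate case: the three points are collinear
    return [(x, y), (x, -y)]           # P2 above or below the x-axis

print(place_third_point(4, 3, 5))      # two mirror-image placements of P2
print(place_third_point(10, 1, 2))     # distances violate the triangle inequality
```

In both calls the system counts as discretely solvable in the sense of the note above: the solution set is finite, even when it is empty.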

An important performance measure of a DR-plan is that the subsystems in 
the decomposition should be as small as possible: as previously mentioned, the 
complexity of solving a subsystem by an algebraic/numeric solver is proportional 
to the size of the subsystem. The optimal DR-plan is the one that minimizes the 
size of the largest such subsystem. 

The problem of finding a DR-plan, if one exists, is called the DR-problem, and we
differentiate it from the problem of finding the optimal DR-plan. An algorithm
that solves the DR-problem by constructing a DR-plan for an input geometric
constraint system is called a DR-planner. To our knowledge, despite its
long-standing presence, the DR-problem has not yet been clearly isolated or
precisely formulated, although there have been many prior, specialized
DR-planners that utilize geometric domain knowledge, e.g., [2, 21, 22, 15],
[16, 19, 6, 7, 4, 10, 17], [1, 23, 18, 27, 26].



1.2 Using graph rewriting for solving DR-problems 

In order to solve a DR-problem we employ a geometric constraint graph structure
(described in the following section) to represent a geometric constraint
system. Using this graph structure, the DR-problem can be informally described
as decomposing the entire graph into small subgraphs of a certain type and
replacing these small subgraphs by other subgraphs until the resulting graph is
small enough. Hence, using the terminology of [20], it could be said that a
DR-plan is a sequence of applications of graph productions, where the left hand
side is the small subgraph to be replaced, the right hand side is the new
subgraph that replaces it, and the embedding transformation specifies the edges
or constraints between the new subgraph and the rest of the graph. Thus solving
the DR-problem requires defining a graph grammar that on the one hand is general
enough to allow rewriting of any subgraph that represents a solvable subsystem,
and on the other hand is meaningful with respect to the geometric context, i.e.,
graph transformations should not change the solvability of the underlying system.
Also, the right hand side of the graph production should be smaller than the
left hand side, since this influences the time performance of the DR-planner.
Finally, the total number of grammar rules should be small for the computer
implementation to be efficient.

In this paper we describe an efficient DR-planner that uses a particular graph
grammar. We would like to note that while we did not use the theoretical
foundations of graph rewriting when proving correctness and convergence
properties of our particular DR-planner, we anticipate that a rewriting framework
will be helpful for studying general DR-planners. However, we are aware of very
little previous work [8, 24, 25] on applying graph grammars to CAD systems. One
example is [8], which uses graphs as a data structure, so that nodes and edges of
the graph represent certain objects and relations between these objects,
respectively. Thus the task of adding new objects or modifying relations between
objects can be stated as graph productions. This allows for natural
implementation and maintenance of the data structure; however, the question of
actually solving constraint systems is not addressed.

1.3 Organization of the paper 

Section 2 describes the geometric constraint graph structure that is used for
solving DR-problems, formally defines a DR-plan in terms of this structure, and
introduces various performance measures for comparing possible DR-plans.
Section 3 describes a new DR-planner, called the Frontier Algorithm (FA), which
was systematically developed to excel in the performance measures defined in
Section 2, as shown in the last subsection, where FA is compared to previously
known algorithms.

2 Using graph transformations for finding DR-plans 

2.1 Geometric constraint graph and degree of freedom analysis 

A geometric constraint system C could be represented by a geometric constraint
graph. A geometric constraint graph G = (V, E, w) is weighted and undirected
with n vertices V (corresponding to the geometric objects of C) and m edges
E (corresponding to the geometric constraints of C); the weight w(v) is the
number of degrees of freedom available to a vertex v and w(e) is the number of
degrees of freedom removed by an edge e. The number of degrees of freedom of a
rigid object is roughly the minimum number of independent variables needed to
fix this object (or its local coordinate system) in space.

Example. A point in 2d has 2 translational degrees of freedom and a distance 
constraint in 2d determines 1 degree of freedom. See Figure 1 for a geometric 
constraint graph corresponding to the geometric constraint problem described 
in the previous section. 

Note that the constraint graph could be a hypergraph, since geometric con- 
straints that involve more than two geometric objects are represented as hyper- 
edges. 

A vertex-induced subgraph A ⊆ G that satisfies

\[ d(A) = \sum_{e \in A} w(e) - \sum_{v \in A} w(v) \ge -D \qquad (1) \]

is called dense. The function d(A) is called the density of A, and the meaning of
the constant D will be explained in the next few paragraphs.

The density of a graph could be used to analyze the discrete solvability of the
corresponding geometric constraint system because of the following arguments. In
the generic case we assume that if the number of equations in a constraint system
is greater than or equal to the number of its variables, then this system is
discretely solvable. Similarly, the genericity assumption for constraint graphs is the
following: if the total number of degrees of freedom removed by constraints (plus 
a constant number D) is greater than or equal to the total number of degrees 
of freedom available to the objects, then the corresponding constraint system is 
discretely solvable. This is because generically the geometric objects can then be 
placed rigidly with respect to each other by using only the constraints between 
them. The absolute position of these objects in space can be fixed by removing 
D more degrees of freedom. In 2d, D is equal to 3: the number of translational 
and rotational degrees of freedom of a rigid planar object. In 3d, D is equal to 6 
in general: 3 rotational and 3 translational degrees of freedom. If the object and 
its local coordinate system are already fixed with respect to a global coordinate 
system, then D = 0. 
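
The density test can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dictionary-based graph representation and the function names are assumptions made for the example. Edges are keyed by tuples of endpoints, so a hyperedge is simply a longer tuple.

```python
def density(vertex_weights, edge_weights, subset):
    """Return d(A) for the subgraph induced by the vertex set `subset`.

    vertex_weights maps a vertex to its degrees of freedom; edge_weights maps a
    tuple of endpoints to the degrees of freedom the constraint removes.
    """
    subset = set(subset)
    # Only edges whose endpoints all lie in the subset belong to the induced subgraph.
    removed = sum(w for ends, w in edge_weights.items() if set(ends) <= subset)
    available = sum(vertex_weights[v] for v in subset)
    return removed - available

def is_dense(vertex_weights, edge_weights, subset, D):
    """A subgraph is dense when its density is at least -D."""
    return density(vertex_weights, edge_weights, subset) >= -D

# Three points in 2d (weight 2 each) with pairwise distance constraints (weight 1 each).
points = {"P1": 2, "P2": 2, "P3": 2}
distances = {("P1", "P2"): 1, ("P2", "P3"): 1, ("P1", "P3"): 1}
print(density(points, distances, points))
print(is_dense(points, distances, points, D=3))

# Dropping one distance leaves a hinge that can still flex, so the test fails.
two_distances = {("P1", "P2"): 1, ("P2", "P3"): 1}
print(is_dense(points, two_distances, points, D=3))
```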

Therefore we call a minimal dense subgraph (i.e., one that does not contain a
proper dense subgraph) discretely solvable, since it generically represents
discretely solvable (i.e., wellconstrained or consistently overconstrained)
subsystems of the corresponding geometric constraint system. A subgraph that is
not dense generically represents a system that is not discretely solvable, i.e.,
underconstrained.

If the dense graph A is not minimal, the system corresponding to it could still
be underconstrained, even in the generic case: the density of A could be the
result of embedding an overconstrained subgraph B ⊂ A with d(B) > −D. Hence dense
subgraphs do not in general represent discretely solvable subsystems unless they
are minimal. For a non-minimal dense subgraph A to generically correspond to
a discretely solvable subsystem, A should remain dense even after replacing any
of its overconstrained subgraphs B by any (other) wellconstrained subgraph of
density exactly equal to −D.
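
As an illustration of this caveat, consider a hypothetical graph in 2d (not taken from the figures of this paper) with w(v) = 2, w(e) = 1 and D = 3. Let B be the complete graph on four vertices, and let A consist of B plus a fifth vertex attached to B by a single edge. Then

\[ d(B) = 6 - 4 \cdot 2 = -2 > -3, \qquad d(A) = 7 - 5 \cdot 2 = -3 \ge -3 . \]

So A is dense only because B is overconstrained, while the fifth vertex can still rotate about its single neighbor. Replacing B by a wellconstrained subgraph on the same four vertices, for instance B with one edge removed, gives

\[ d(B') = 5 - 4 \cdot 2 = -3, \qquad d(A') = 6 - 5 \cdot 2 = -4 < -3 , \]

which exposes the fact that A is underconstrained.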



2.2 DR-plans in terms of graph transformations 

Consider Figure 2. The top depicts a geometric constraint graph G, where the
weight of each edge is 1, the weight of each vertex is 2, and the dimension
dependent constant D is equal to 3. One of the plans for decomposing and
recombining G (and the geometric constraint system that G represents) into small
discretely solvable subgraphs (representing subsystems) is to decompose G into
the dense subgraphs a = {1, 2, 3, 4}, b = {5, 6, 7, 8}, c = {9, 10, 11, 12},
d = {13, 14, 15, 16}, e = {17, 18, 19, 20}; represent their solutions in a
simplified graph so that they can be recombined, say by representing them as
vertices a, b, c, d, e; recursively decompose the simplified graph into
I = {a, b, c}, II = {d, e}; represent and recombine their solutions as vertices
I, II; and so on until the entire graph is recombined and represented as a
single vertex. Hence a DR-plan should indicate a sequence of subgraphs or
subsystems chosen at every stage, whose containment relationships are shown at
the bottom of Figure 2.

Fig. 2. Geometric constraint graph and a tree of dense subgraphs







Formally, a DR-plan Q of a geometric constraint graph G is a sequence of graphs
Gi. This sequence has the following properties.

1. G1 = G.

2. Decomposition. For all i, 1 ≤ i ≤ n, there is a minimal dense subgraph
Si ⊆ Gi.

3. Recombination. Graph G_{i+1} is a modification of Gi, constructed by
replacing subgraph Si by an abstraction or simplification subgraph Ti(Si), which
induces a transformation Ti(Gi) = G_{i+1}. This transformation, defined for
all subgraphs of Gi, is also called a simplifier (this is analogous to
substituting the solution of Si into the remaining equation system).

4. Sn = Gn.

Typically the DR-plan Q is specified not just by the sequence of graphs Gi, but
also by the corresponding sequence of discretely solvable subgraphs Si and the
corresponding sequence of graph simplifier transformations Ti.




Fig. 3. DR-plan, Gn = T_{n-1}(⋯(T1(G1)))



Figure 3 shows a general DR-plan. Note that there could be many DR-plans for
a given geometric constraint graph G. These plans depend upon the choices of
Si and Ti(Si) at every iteration i. In the next subsection, we will formalize what
these choices could be.
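
The structure of a DR-plan as a chain of simplifier applications can be sketched as below. This is a toy illustration under strong assumptions: graphs are reduced to their vertex sets (edges and weights are ignored), and the hypothetical helper `condense` collapses a whole cluster into one new vertex, which is only one possible choice of simplifier. The cluster names follow the decomposition described for Figure 2.

```python
def condense(cluster, name):
    """Return a toy simplifier that replaces the vertices of `cluster` by `name`."""
    cluster = frozenset(cluster)

    def simplifier(graph):
        if not cluster <= graph:
            raise ValueError("cluster is not contained in the current graph")
        return (graph - cluster) | {name}

    return simplifier

def run_plan(initial_vertices, simplifiers):
    """Return the sequence G_1, ..., G_n with G_{i+1} = T_i(G_i)."""
    graphs = [frozenset(initial_vertices)]
    for simplify in simplifiers:
        graphs.append(simplify(graphs[-1]))
    return graphs

plan = [
    condense({1, 2, 3, 4}, "a"),
    condense({5, 6, 7, 8}, "b"),
    condense({9, 10, 11, 12}, "c"),
    condense({13, 14, 15, 16}, "d"),
    condense({17, 18, 19, 20}, "e"),
    condense({"a", "b", "c"}, "I"),
    condense({"d", "e"}, "II"),
]
graphs = run_plan(range(1, 21), plan)
print(sorted(graphs[-1]))   # the last graph, whose cluster is the whole graph
```

Each entry of `plan` plays the role of one Ti, and the list `graphs` records the intermediate graphs Gi.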

2.3 Properties of graph transformations 

Graph simplifiers Ti must satisfy the following three natural requirements.

(1) If A is a subgraph of B then Ti(A) is a subgraph of Ti(B).

(2) Ti(A) ∪ Ti(B) is the same as the graph Ti(A ∪ B).

(3) Ti(A) ∩ Ti(B) is the same as Ti(A ∩ B).

Note. Any subgraph is understood to be vertex induced. The union (resp.
intersection) of two subgraphs A and B is the graph induced by the union (resp.
intersection) of the vertex sets of A and B.

According to the definition of the DR-plan, every DR-plan must satisfy the
following four requirements, together called validity.

(4) Every constraint graph Gi in the DR-plan can be written as Si ∪ Bi, where
Si is discretely solvable, and Si and Bi do not have common edges.

(5) For every A ⊆ Bi, Ti(A) is isomorphic to A.

(6) The initial map T0 is the identity mapping on the subgraphs of G1.

(7) All the pre-images of every Si under the earlier simplifiers Tj, i.e., for
all 1 ≤ j ≤ i − 1, are discretely solvable.

Another natural rule for the DR-plans is to prevent the graph simplifiers Ti
from mapping discretely solvable subgraphs to not discretely solvable ones and
vice versa. The DR-plan Q of G is discrete solvability preserving (or solvability
preserving for short) if for all i and for all subgraphs A ⊆ Gi such that
A ∩ Si = ∅ or A ⊆ Si, A is discretely solvable if and only if the corresponding
simplified subgraph Ti(A) is discretely solvable. The DR-plan Q of G is
strictly solvability preserving if for all i and for all subgraphs A ⊆ Gi, A is
discretely solvable if and only if Ti(A) is discretely solvable. The DR-plan Q
of G is complete if for all i and for all discretely solvable subgraphs B
of the subgraphs Si chosen by Q it holds that B = T_{i-1} T_{i-2} ⋯ Tj(Sj) for
some j ≤ i − 1.

A conceptual design decomposition P of a geometric constraint graph G is a
collection of discretely solvable subgraphs Pi ⊆ G, which are partially ordered
with respect to the subgraph relation. A DR-plan Q is said to incorporate a
design decomposition P if the sequence of discretely solvable subgraphs Si in
Q embeds a topological ordering of P as a subsequence (a topological ordering
is a linear order that is consistent with the natural partial order given by the
subgraph relation on P).
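
The notion of incorporating a design decomposition can be checked mechanically. The sketch below assumes, for illustration only, that every subgraph chosen by the plan is identified by its set of original vertices; the function name and this representation are not part of the paper.

```python
def incorporates(plan_subgraphs, decomposition):
    """Check whether a plan embeds a topological ordering of a design decomposition.

    plan_subgraphs: the subgraphs chosen by the plan, in plan order, each given
    as a set of original vertices. decomposition: the subgraphs of the design
    decomposition, in any order.
    """
    position = {}
    for index, subgraph in enumerate(plan_subgraphs):
        # Remember the first stage at which each vertex set is chosen.
        position.setdefault(frozenset(subgraph), index)
    parts = [frozenset(p) for p in decomposition]
    if any(p not in position for p in parts):
        return False                   # some design subgraph never appears in the plan
    # Every proper containment between design subgraphs must agree with the plan order.
    return all(position[a] < position[b]
               for a in parts for b in parts if a < b)

plan_order = [{1, 2, 3}, {1, 2, 3, 4}, {5, 6}, {1, 2, 3, 4, 5, 6}]
design = [{1, 2, 3, 4}, {1, 2, 3}, {1, 2, 3, 4, 5, 6}]
print(incorporates(plan_order, design))
print(incorporates(list(reversed(plan_order)), design))
```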

In order to define an optimal DR-plan we need to first define the size of
a geometric constraint graph A and the size of a DR-plan Q. The size of an
arbitrary subgraph A ⊆ G1 = G is equal to the sum of the weights w(v) of its
vertices. The size of an arbitrary subgraph A ⊆ Gi is computed by adding the
appropriate constant D (3 in 2d, 6 in 3d, as explained earlier) for any of the
images of Sj contained in A and adding the weights of the vertices of A that are
not contained in any such image.

The size of a DR plan of G is the maximum of the sizes of the corresponding 
discretely solvable subgraphs Si. The optimal size of the constraint graph G is 
the minimum of sizes of all DR-plans of G. The optimal DR-plan of G is the 
DR-plan that has size equal to the optimal size of G. The approximation factor 
of a DR-plan Q of the graph G is defined as the ratio of the optimal size of G 
to the size of Q. The optimal DR-planning problem is a problem of finding an 
optimal DR-plan. 

Note. The optimal DR-planning problem is NP-hard. This follows from our
result in [13] showing that the problem of finding a minimum dense subgraph
is NP-hard, by a reduction from CLIQUE. The CLIQUE problem
is extremely hard to approximate [9], i.e., finding a clique of size within a
factor n^(1−ε) of the size of the maximum clique cannot be achieved in time
polynomial in n, for any constant ε > 0 (unless P = NP). However, our reduction of
CLIQUE to the optimal DR-planning problem is not a gap-preserving reduction; thus
the polynomial time approximability of this problem is still an open question.

In addition to the above performance measures for the DR-plans, next we de- 
fine several performance measures for DR-planning algorithms or DR-planners. 
We assume that all DR-planners use randomized choices at each step where an 
arbitrary selection of a vertex or an edge is needed. 

The worst-choice approximation factor of a DR-planner on input graph G 
is the minimum of the approximation factors of all DR-plans Q obtained over 
all possible random choices. The best-choice approximation factor of the DR- 
planner on input graph G is the maximum of the approximation factors of all 
the DR-plans Q obtained over all possible random choices. 
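
These definitions translate directly into a few helper functions. The sketch below is illustrative only: it assumes that each DR-plan is summarized by the list of sizes of its subgraphs Si and that the optimal size of G is already known, and all names and numbers are hypothetical.

```python
def plan_size(subgraph_sizes):
    """Size of a DR-plan: the largest size among its subgraphs S_i."""
    return max(subgraph_sizes)

def approximation_factor(optimal_size, subgraph_sizes):
    """Ratio of the optimal size of G to the size of the given plan (at most 1)."""
    return optimal_size / plan_size(subgraph_sizes)

def choice_factors(optimal_size, plans):
    """Return (worst-choice, best-choice) factors over all plans a planner may output."""
    factors = [approximation_factor(optimal_size, p) for p in plans]
    return min(factors), max(factors)

# Three plans that the same planner might output under different random choices.
possible_plans = [[4, 6, 5], [8, 6], [12]]
worst, best = choice_factors(6, possible_plans)
print(worst, best)
```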

A DR-planner is general if it terminates with a DR-plan when given a discretely
solvable system as input. A DR-planner is said to have the Church-Rosser
property if it terminates with a DR-plan irrespective of which discretely
solvable graph Si is chosen at the ith stage. Given the Church-Rosser property,
at each step a discretely solvable subgraph Si can be chosen greedily to satisfy
the requirements of the planner. This avoids exhaustive search.

A DR-planner X adapts to underconstrained constraint graphs G if every
(partial) DR-plan produced by X terminates with a Gn consisting of a set W =
{A1, ..., Ak} of discretely solvable subgraphs (instead of a single subgraph Sn
in the case of a well-constrained graph G), such that W is maximal, i.e., no
union of a subset of W gives a discretely solvable subgraph.

3 A new DR-planner: Frontier Algorithm 

In this section, we present a new DR-planner whose development is systematically
guided by the performance measures discussed in the previous section. This new
algorithm, called the Frontier Algorithm (FA), is designed by choosing a priori a
particular type of graph transformation Ti for the DR-plan.

FA uses an earlier algorithm developed by the authors for locating the minimal
dense subgraphs Si at each stage i. This algorithm, called “Algorithm Dense”, is
based on a subtle modification of incremental network flow. It first isolates a
dense subgraph, and then finds a minimal dense subgraph inside it, which ensures
its discrete solvability. The interested reader is referred
to [13] for a description as well as implementation results, and to [14] for an ex- 
tensive comparison with prior algorithms for isolating discretely solvable/dense 
subgraphs with respect to performance measures discussed in the previous sec- 
tion. 

The discretely solvable subgraph Si that has been found is simplified or
abstracted as follows. The subgraph of Si induced by its internal vertices (those
that are not adjacent to any vertex outside of Si) is replaced by one vertex (the
core vertex) ci. The frontier vertices of Si (i.e., those that are not internal),
the edges connecting them, and their weights remain unchanged. The core vertex is
connected to each frontier vertex v of Si by a combined edge e whose weight is
the sum of the weights of the original edges connecting internal vertices to v.
The weight of the core vertex is chosen so that the density of Ti(Si) is exactly
equal to −D, where D is the geometry-dependent constant explained earlier.

For example, consider Figure 4. The graph on the left has all edge weights
equal to 1 and all vertex weights equal to 2; the dimension dependent
constant D in this case is equal to 3. Assume that the discretely solvable
subgraph ABCDEF is chosen to be Si at the current stage. Then B and D are the
frontier vertices of Si, and A, C, E, F are the internal ones. Thus Si will be
simplified by FA into a graph consisting of three vertices ci, B, D, with the
weight of B and D being 2 (the same as in Si), the weight of the edges (ci, B)
and (ci, D) being 3 (given by w(AB) + w(CB) + w(EB) and w(CD) + w(ED) + w(AD)
respectively) and the weight of ci being 5; whereby the density of Ti(Si) is
w(ciB) + w(ciD) − w(ci) − w(B) − w(D) = −3, as required for rigid bodies in
2d.





Fig. 4. Geometric constraint graph before and after simplification 
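
The simplification of a single cluster can be sketched as follows. This is a minimal illustration of the rule described above, not the authors' implementation, and it handles only binary edges. The edges AB, CB, EB, AD, CD, ED come from the worked example; the edges AF, CF, EF and the outside vertices X and Y are assumptions added so that the cluster is dense and B and D are frontier vertices.

```python
def fa_simplify(vertex_weights, edge_weights, cluster, D, core="c"):
    """Simplify one cluster: keep frontier vertices, merge internal ones into a core.

    edge_weights maps frozenset({u, v}) to a weight, for the whole current graph.
    Returns the vertex weights and edge weights of the simplified cluster.
    """
    cluster = set(cluster)
    # A cluster vertex is a frontier vertex if some edge joins it to a vertex outside.
    frontier = {v for e in edge_weights if not e <= cluster for v in e if v in cluster}
    internal = cluster - frontier
    new_edges = {}
    for e, w in edge_weights.items():
        if not e <= cluster:
            continue                               # edge leaves the cluster: not part of it
        frontier_ends = e - internal
        if len(frontier_ends) == 2:
            new_edges[e] = w                       # frontier-frontier edge is kept as is
        elif len(frontier_ends) == 1:
            (v,) = frontier_ends
            key = frozenset({core, v})
            new_edges[key] = new_edges.get(key, 0) + w   # combine edges into (core, v)
        # edges between two internal vertices are dropped
    new_weights = {v: vertex_weights[v] for v in frontier}
    # Choose the core weight so that the simplified cluster has density exactly -D.
    new_weights[core] = sum(new_edges.values()) - sum(new_weights.values()) + D
    return new_weights, new_edges

vertex_weights = {v: 2 for v in "ABCDEFXY"}
pairs = ["AB", "CB", "EB", "AD", "CD", "ED", "AF", "CF", "EF", "BX", "DY"]
edge_weights = {frozenset(p): 1 for p in pairs}
weights, edges = fa_simplify(vertex_weights, edge_weights, "ABCDEF", D=3)
print(weights)
print(edges)
```

Under these assumptions the sketch reproduces the numbers of the example: two combined edges of weight 3 and a core vertex of weight 5.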



Formally, the DR-planner FA could be described by specifying the simplifier
transformations. Let Si be the minimal dense subgraph of Gi found at stage i, let
SI be the subgraph induced by the inner vertices of Si, let FI be the subgraph
induced by the frontier vertices of Si, and let A be any subgraph of Gi. The
simplifier transformation Ti is defined as follows:

- If A ∩ SI = ∅, then Ti(A) = A.

- If A ∩ SI ≠ ∅, then Ti(A) = (V_TA, E_TA), where V_TA is the union of all
vertices of A \ SI and all vertices of FI, plus the core vertex ci. The set of
edges E_TA is the union of all the edges of A and of all the edges of Si, with
the exception of edges that have at least one endpoint in SI. Of these, edges
that have at least one endpoint outside SI are combined (their weights are
combined as well). Edges that have all endpoints in SI are removed completely.

- The weight assigned to ci is such that the density of Ti(Si) becomes exactly
−D.

3.1 Performance analysis of FA

In this section, we state properties of the new DR-planner FA with respect to 
the various performance measures defined in Section 2. 

Proposition 1. FA is a valid DR-planner with the Church-Rosser property. In
addition, FA finds DR-plans for the maximal discretely solvable subgraphs of
underconstrained graphs.

Proof. If the graph G is not underconstrained, then it will remain so after the
replacement of any discretely solvable subgraph by a subgraph of density −D,
i.e., after a simplification step by FA. Thus, if G = G1 is dense, it follows that
all of the Gi are dense. Moreover, we know that if the original graph is
discretely solvable, then at each step, FA will in fact find a minimal dense
subgraph Si that consists of more than one vertex, and therefore the size of
G_{i+1} is strictly smaller than the size of Gi (using our definition of size)
for all i. Thus the process will terminate since the entire graph will eventually
be minimal dense, and at that stage, the termination condition Sn = Gn holds.
This is independent of which discretely solvable subgraph Si is chosen to be
simplified at the ith stage, showing that FA has the Church-Rosser property.

On the other hand, if G is underconstrained, since the subgraphs Si chosen
to be simplified are guaranteed to be dense/discretely solvable, the process will
not terminate with one vertex, but rather with a set of vertices representing the
simplification of a set of maximal discretely solvable subgraphs (such that no
combination of them is discretely solvable). This completes the proof that FA is
a DR-planner that can adapt to underconstrained graphs.

The proof of validity follows straightforwardly from the properties of the
simplifier mapping. □

Proposition 2. FA is solvability preserving, but not strictly solvability
preserving in the general case.



Fig. 5. The original graph is dense; the graph simplified by FA is not



Proof. Consider for example Figure 5. Initially ABC and BCD are dense. After
ABC has been simplified into EC (C is the frontier vertex, E is the core vertex),
the image of the dense subgraph BCD is ECD, which is no longer dense. □

However, in geometric applications, situations like this, where the “inner” part
BC imposes some constraints on the “outer” part CD, do not arise. A more
illuminating analysis of FA can be accomplished by assuming that the input
is restricted to graphs that have geometrically meaningful interpretations. The
dense graph G is said to be geometrically consistent if and only if for any two
rigid or discretely solvable subgraphs A and B of G such that A contains some
inner vertices of B, the union of A and B is rigid as well.

Note. For example, consider the case of vertices representing points and edges
representing distances in 2d. Here if any two rigid clusters share internal vertices,
then they must share more than one frontier vertex, and from 2d geometry, this 
would mean that their union is rigid as well. Similarly in 3d, if two rigid clusters 
share internal vertices, then they must share more than 2 frontier vertices, and 
by 3d geometry, this would mean that their union is rigid as well. 

Proposition 3. If the graph G is geometrically consistent, then FA is solvability
preserving and strictly solvability preserving.

Proof. Let B be a dense graph, and suppose that the dense cluster Si was
simplified by FA. Then B would only be affected by this simplification if B
contains at least one internal vertex of Si (recall that frontier vertices of Si
remain unchanged). But then, by the definition of the simplifier, Ti(B) is the
same as Ti(B ∪ Si). Now, by definition of Ti, Ti(B ∪ Si) is obtained by replacing
Si by Ti(Si), which has density exactly −D. Due to geometric consistency, B ∪ Si
is discretely solvable, and a discretely solvable graph remains dense even after
any of its discretely solvable (well or overconstrained) subgraphs is replaced by
any (other) subgraph of density exactly −D. Thus Ti(B) = Ti(B ∪ Si) is also
discretely solvable. □

Proposition 4. FA is complete 

Proof. This is because FA finds minimal dense subgraphs at each stage. □ 

Proposition 5. FA has worst-choice approximation factor O(1/n) (even for
geometrically consistent graphs) (proof can be found in [14]).

Proposition 6. The best-choice approximation factor of FA is at least 1/2 for
geometrically consistent graphs.

Proof. Let G be the weighted constraint graph. Let Q be an optimal DR-plan
of G, and let q be the size of Q (i.e., the size of every cluster Si simplified
under the optimal DR-plan is less than q + 1). We will show that FA could produce
a DR-plan Q' that is “close to” Q. Complete resemblance (Q' = Q) may not
be possible, since the internal vertices of the cluster S'i, found by FA at the
ith stage, are simplified into one core vertex, thereby losing some information
about the structure of the graph. However, we will show that there is a way of
keeping the size of Q' within an additive constant of the size of Q.

Suppose that FA is able to follow the optimal DR-plan Q up to stage i, i.e.,
S'i = Si. Suppose that there is a cluster Sj in the DR-plan Q such that
i < j and Sj contains some internal vertices of Si. Then the simplification
of Sj by FA may be different from the simplification of Sj in Q. However, since
the union of Si and Sj is discretely solvable (due to geometric consistency), FA
could use S'j = Ti(Si) ∪ Sj instead of Sj. The size of S'j differs from the size
of Sj by at most D units, where D is the constant depending on the geometry of
the problem. Hence the size of Q' is at most q + D, and since q is at least D,
the approximation factor q/(q + D) is at least q/(2q) = 1/2, and the result
follows. □

Proposition 7. FA can incorporate a design decomposition of the input graph
if and only if all pairs of subgraphs A and B in the given design decomposition
satisfy: the vertices in A ∩ B are not among the internal vertices of either A or
B (proof can be found in [14]).



3.2 Example operation of FA 

Consider a geometric constraint graph G and design decomposition
P = {P1, P2, P3, P4} of Figure 6, where the weight of all vertices in G is equal
to 2, the weight of all the edges is equal to 1, and the dimension dependent
constant D is equal to 3.

For this G and P, the FA planner will construct a DR-plan Q shown in
Figure 7. Crucial intermediate graphs Gi in the DR-plan output by FA are
shown in Figure 8.

The description of the discretely solvable subgraphs Si found at the ith stage is
given in the table below (Si.Core is the new vertex in G_{i+1} that replaces the
inner vertices of Si after simplification, Si.FV is a list of the frontier
vertices of Si, Si.GP is a list of previously found discretely solvable subgraphs
that comprise Si, Si.OV is a list of the vertices of G that have been transformed
into Si).



 i   Si.Core(weight)   Si.FV          Si.GP                 Si.OV
 1   ∅                 {1, 2, 3}      {1, 2, 3}             {1, 2, 3}
 2   A (2)             {1, 2, 4}      {S1, 4}               {1, 2, 3, 4}
 3   B (4)             {2, 9}         {2, 5, 6, 7, 8, 9}    {2, 5, 6, 7, 8, 9}
 4   ∅                 {1, 10, 11}    {1, 10, 11}           {1, 10, 11}
 5   C (3)             {1, 12}        {S4, 12}              {1, 10, 11, 12}
 6   ∅                 {12, 13, 14}   {12, 13, 14}          {12, 13, 14}
 7   D (2)             {12, 14, 4}    {S6, 4}               {12, 13, 14, 4}
 8   H (5)             {2, 14}        {S2, S5, S7}          {1, 2, 3, 4, 10, ..., 14}
 9   ∅                 {14, 15, 16}   {14, 15, 16}          {14, 15, 16}
10   E (3)             {14, 9}        {S9, 9}               {14, 15, 16, 9}
11   I (5)             {2, 9}         {S8, S10}             {1, 2, 3, 4, 9, ..., 16}
12   J (3)             ∅              {S11, S3}             {1, ..., 16}



3.3 Comparison of performance 

There are 3 previously known types of DR-planners. One type of DR-planner is
based on ideas in [2,15,16,21,22]. During the decomposition phase, DR-planners of
this type locate subgraphs of a certain shape (for example, triangles). We denote
such DR-planners by SR, which stands for “Shape Recognition”. Another type
of DR-planner is based on ideas in [1, 23, 19, 18]. DR-planners of this type involve
finding a maximum matching in a bipartite graph formed by geometric objects and
geometric constraints. We denote such DR-planners by MM, which stands for
“Maximum Matching”. The third type of DR-planner is based on ideas in [13].
This DR-planner behaves similarly to FA during the decomposition phase; however,
during the recombination phase it condenses the entire dense subgraph Si into a
single vertex. We denote such DR-planners by CA, which stands for “Condensing
Algorithm”.

Fig. 6. Input constraint system and design decomposition

Fig. 7. The structure of discretely solvable subgraphs Si in DR-plan output by FA for
input in Figure 6

Fig. 8. Crucial Gi in DR-plan
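To make the matching step behind MM-type planners more concrete, the following Python sketch computes a maximum matching between constraints and geometric objects with the classical augmenting-path method. It is only an illustration of bipartite matching: the function name, the input encoding, and the toy triangle are assumptions, and the planners cited above perform further analysis on top of the matching that is not shown here.

```python
def maximum_matching(constraints):
    """Maximum bipartite matching via augmenting paths.

    `constraints` maps each constraint name to the geometric objects it
    involves (hypothetical encoding). Returns a dict object -> constraint.
    """
    match = {}

    def try_assign(constraint, seen):
        for obj in constraints[constraint]:
            if obj in seen:
                continue
            seen.add(obj)
            # Take a free object, or re-route the constraint currently holding it.
            if obj not in match or try_assign(match[obj], seen):
                match[obj] = constraint
                return True
        return False

    for constraint in constraints:
        try_assign(constraint, set())
    return match


# Toy input: three distance constraints on the points of a triangle.
triangle = {
    "d12": ["p1", "p2"],
    "d23": ["p2", "p3"],
    "d13": ["p1", "p3"],
}
matching = maximum_matching(triangle)
print(matching)
```

Every matched object is paired with a constraint that actually involves it; objects or constraints left unmatched are what a matching-based analysis would inspect next.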

The detailed descriptions of these DR-planners in terms of graph simplifiers and
proofs for their performance analysis are given in [14]. Here we briefly outline
the main advantages of FA. A DR-planner of type SR can only produce DR-plans
that require the dense subgraphs Si to consist of triangles or another fixed
repertoire of patterns. Even when restricted to such inputs, a DR-planner of type SR is
still not general and not complete. A DR-planner of type MM can only produce
DR-plans that require the discretely solvable subsystems Si to represent rigid
bodies that are fixed or grounded with respect to a single coordinate system.
Even when restricted to such inputs, a DR-planner of type MM still cannot
handle underconstrained graphs, cannot incorporate an input design decomposition,
and is not complete.

In contrast, FA places no restrictions on inputs, it is general, it can handle under- 
constrained graphs, it can incorporate design decomposition, is valid, complete, 
solvability preserving and strictly solvability preserving for the geometrically 
consistent graphs. While all four types of DR-planners have worst-choice approx- 
imation factor of O(^), FA is the only algorithm that has constant best-choice 
approximation factor. 

Acknowledgements. We would like to thank the anonymous reviewers for their
helpful suggestions.

Abstract. Management of development processes in different engineering
disciplines is a challenging task. The AHEAD system addresses these
challenges by providing an integrated environment for modeling and
managing development processes. Products, activities, and resources are
managed in an integrated way; furthermore, AHEAD supports evolving
development processes by seamless interleaving of planning and execution.
AHEAD is based on programmed graph transformations; tools
are generated from a graph-based specification. Finally, a widespread
object-oriented modeling language (UML) is employed for acquiring process
knowledge from domain experts.



1 Introduction 

Development of products in disciplines such as mechanical, chemical, or software 
engineering is a challenging task. Costs have to be reduced, the time-to-market 
has to be shortened, and quality has to be improved. Skilled engineers and 
sophisticated tools for performing technical work are necessary, yet not sufficient 
prerequisites for meeting these ambitious goals. In addition, the work of engineers 
must be coordinated so that they cooperate smoothly. To this end, the steps of 
the development process have to be planned, an engineer executing a task must 
be provided with documents and tools, the results of development activities 
have to be fed back to management which has to adjust the plan accordingly, 
the documents produced in different working areas have to be kept consistent 
with each other, etc. 

The work described in this paper was partially supported by the Deutsche
Forschungsgemeinschaft (Sonderforschungsbereich 476 “IMPROVE” and
Graduiertenkolleg “Informatik und Technik”).

The AHEAD system (Adaptable and Human-Centered Environment for the
Administration of Development Processes) addresses these challenges. AHEAD
supports the management and modeling of development processes. Its key
features are the following:

1. Products (documents such as requirements definitions, software architectures,
or module implementations), activities (tasks such as design, implementation,
and test), and resources (developers assigned to tasks and tools
supporting developers) are managed in an integrated way.

2. AHEAD takes care of the dynamics of development processes. Due to nu- 
merous factors (e.g., product evolution, feedback, concurrent or simultaneous 
engineering), development processes cannot be completely planned a priori; 
rather, they constantly evolve during execution. Thus, planning and execu- 
tion have to be interleaved seamlessly. 

3. AHEAD is human-centered in that it supports both managers and developers
through interactive tools. It does not attempt to automate management, as
is done, e.g., in many workflow management systems.

4. AHEAD is based on graph transformations. Task nets, version histories, 
product configurations, etc. may be modeled as graphs in a natural way. 
Operations on these graphs are declaratively specified by graph rewrite rules. 
In this way, we obtain a specification of the underlying management model 
at a high level of abstraction. 

5. Tools are generated from graph-based specifications rather than encoded by
hand. This not only reduces implementation effort; in addition, proofs
of the correctness of the implementation with respect to the specification are
no longer necessary.

6. AHEAD is adaptable to different application domains. To acquire process 
knowledge from domain experts, we use a wide-spread object-oriented mod- 
eling language (UML) rather than graph rewriting systems. Object-oriented 
process models are automatically translated into graph transformations. 
Thus, the translation formally defines the semantics of process models in 
UML. 

AHEAD is a successor of a management system which was developed in the 
SUKITS project that dealt with development processes in mechanical engineer- 
ing. AHEAD is currently being developed within IMPROVE, a Collaborative 
Research Council that investigates methods and tools for development processes 
in chemical engineering (see [19] for a description of both projects). IMPROVE 
is a research project carried out by engineers and computer scientists at Aachen 
University of Technology. There are strong contacts to industrial partners, in 
particular Bayer AG. The AHEAD system will be evaluated both internally by 
the engineering partners and externally by the industrial partners. 

Within the IMPROVE project, AHEAD is being applied to a complex process 
which deals with the development of a chemical plant for the production of 
Polyamid-6. In this paper, however, we will stick to a simple example from the 
software engineering domain in order to explain the underlying concepts in an 
easily understandable way. 

The rest of this paper is structured as follows. Section 2 describes the underlying
model for managing products, activities, and resources. Section 3 provides an
overview of the AHEAD system. Section 4 presents the environments offered
by the AHEAD system for supporting process modelers, managers, and developers.
Section 5 discusses related work. Section 6 concludes the paper.

2 Background 

The AHEAD system is based on an integrated model for managing products, 
activities, and resources. The respective submodels are briefly described below 
(see [14,24] for more detailed descriptions). 

CoMa (Configuration Management, [23]) supports version control, configu- 
ration control, and consistency control for heterogeneous documents through an 
integrated model based on a small number of concepts. In the course of devel- 
opment, documents such as designs, manufacturing plans, or NC programs are 
created with the help of heterogeneous tools. Documents are related by man- 
ifold dependencies, both within one working area and across different working 
areas. The representation of these dependencies lays the foundation for consis- 
tency control between interdependent documents. Documents and their mutual 
dependencies are aggregated into configurations. Since development processes 
may span long periods of time, both documents and configurations evolve into 
multiple versions. Versions are recorded for various reasons, including reuse, 
backup, and coordination of team work. Consistency control takes versioning 
into account, i.e., it is precisely recorded which versions of different documents 
are consistent with each other. 
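As an illustration of the two ideas in this paragraph (documents evolving into version graphs, and consistency being recorded between specific versions), consider the following Python sketch. It is not the CoMa model itself; the class names `VersionGraph` and `ConsistencyRecord` and the sample documents are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class VersionGraph:
    """Evolution history of one document: version -> list of successor versions."""
    document: str
    successors: dict = field(default_factory=dict)

    def add_version(self, version, predecessor=None):
        if version in self.successors:
            raise ValueError(f"{self.document} already has version {version}")
        if predecessor is not None and predecessor not in self.successors:
            raise KeyError(f"unknown predecessor {predecessor}")
        self.successors[version] = []
        if predecessor is not None:
            self.successors[predecessor].append(version)


class ConsistencyRecord:
    """Records which (document, version) pairs are consistent with each other."""

    def __init__(self):
        self._pairs = set()

    def record(self, a, b):
        self._pairs.add(frozenset((a, b)))

    def consistent(self, a, b):
        return frozenset((a, b)) in self._pairs


design = VersionGraph("design")
design.add_version(1)
design.add_version(2, predecessor=1)
design.add_version(3, predecessor=1)  # a second branch derived from version 1

plan = VersionGraph("manufacturing_plan")
plan.add_version(1)

record = ConsistencyRecord()
record.record(("design", 2), ("manufacturing_plan", 1))
print(record.consistent(("manufacturing_plan", 1), ("design", 2)))
print(record.consistent(("manufacturing_plan", 1), ("design", 3)))
```

Consistency is stored per version pair, so a new design version is not assumed to be consistent with the plan until that is recorded explicitly.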

DYNAMITE (DYNAMIc Task nETs, [9]) supports dynamic development
processes through evolving task nets. Editing, analysis, and execution of task 
nets may be interleaved seamlessly. A task is an entity which describes work 
to be done. The interface of a task specifies what to do (in terms of inputs, 
outputs, pre- and postconditions, etc.). The realization of a task describes how 
to perform the work. A suitable realization may be determined only at run time, 
using one of multiple alternative realization types. A realization is either atomic 
or complex. In the latter case, there is a refining subnet (task hierarchies). In 
addition to decomposition relationships, tasks are connected by control flows 
(which are akin to precedence relationships in PERT charts), data flows, and 
feedback flows. 
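The following Python sketch illustrates some of these notions: a task with an interface, an atomic or complex realization, and control flows that induce a possible execution order. It is a simplified illustration, not the DYNAMITE model; the `Task` fields and the helper `execution_order` are assumptions, and the ordering uses the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later).

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Task:
    """A task with a minimal interface; a non-empty subnet means a complex realization."""
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    subtasks: list = field(default_factory=list)

    @property
    def is_atomic(self):
        return not self.subtasks


def execution_order(control_flows):
    """Return one order of tasks compatible with the (predecessor, successor) flows."""
    sorter = TopologicalSorter()
    for pred, succ in control_flows:
        sorter.add(succ, pred)  # succ may only start after pred
    return list(sorter.static_order())


change_request = Task(
    "ChangeRequest",
    subtasks=[Task("Redesign"), Task("Implement"), Task("Test")],
)
flows = [("Redesign", "Implement"), ("Implement", "Test")]
order = execution_order(flows)
print(order)
```

Feedback flows are deliberately left out of the ordering, since they run opposite to control flows and would otherwise introduce cycles.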

RESMOD (RESource Management MODel, [15]) is concerned with the resources
required for executing development processes. This includes both human
and computer resources, which are modeled in a uniform way. Complex resources 
are represented by resource configurations, whose components are connected by 
dependencies. Prior to the execution of some project, resource requirements may 
be planned. To this end, RESMOD provides (abstract) plan resources to which 
actual (concrete) resources may be assigned later on. Finally, RESMOD supports
multi-project management, which, for example, includes global balancing
of workloads. To this end, the RESMOD model introduces project resources
which are allocated from an enterprise-wide pool of base resources.

An example illustrating these submodels and their integration is given in
Figure 1. The figure refers to a sample process from software maintenance that






Dirk Jager et al. 




Fig. 1. Integrated management of products, activities, and resources 



is used as a running example. In response to a change request for extending the 
functionality of a software system, the system is redesigned, affected and new 
modules are changed and implemented anew, respectively, and all modifications 
are tested in bottom-up order. The task net consists of tasks connected by con- 
trol flows and feedback flows; moreover, task parameters are connected by data 
flows. Parameters refer to versions of documents; the evolution history of each 
document is represented by a version graph. Resources are classified into plan 
resources and actual resources. Developers are assigned to tasks occurring in 
the task net; they are assisted by tools such as a CASE tool, a text editor, a 
compiler, and a debugger. 

3 Overview of the AHEAD System 

An overview of the AHEAD system is given in Figure 2. The figure is structured
into four regions¹. The horizontal line separates the definition level from the
instance level; the vertical line is used to distinguish between external and internal
representations. Furthermore, there are four environments supporting different
kinds of users: the modeling environment for domain experts, the PROGRES
environment for specification experts, the management environment for project
managers, and the work environment for developers.

At the instance level, the AHEAD system offers tools supporting the management
and execution of development processes. The management environment
supports project managers in planning, analyzing, monitoring, and controlling
development projects; the work environment assists developers in executing the
tasks assigned to them by project managers. Both environments offer external
views on the underlying management database. For example, PERT-chart
views on task nets are presented to project managers; developers are supplied
with task agendas and work contexts providing documents and tools. The management
environment and the work environment jointly constitute the process
support environment [8].

¹ The figure illustrates only the modeling and management of activities. However,
AHEAD covers products and resources as well.

Fig. 2. AHEAD system

Internally, the management data are represented by complex graph structures 
such as version graphs, configuration graphs, task graphs, and resource graphs. 
These graphs and the operations provided on them are defined by programmed 
graph rewriting systems. At the definition level, specifications are edited, ana- 
lyzed, and interpreted in the PROGRES environment [21]. Subsequently, code 
is generated to obtain management tools operating at the instance level. Each 
specification is composed of a generic part (meta model), which can be reused 
for all (or at least a large class of) development processes, and a specific part 
(model definition), which incorporates domain- specific knowledge. The generic 
model is modified rarely; a specific model must be supplied for each application 
domain. 

A specific model has to be developed in close cooperation with domain ex- 
perts. AHEAD addresses different engineering disciplines such as mechanical, 
chemical, electrical, or software engineering (the latter of which is used for the 
examples presented in this paper). Clearly, we cannot assume domain experts
to be familiar with PROGRES. Instead, we are employing a widespread object-oriented
modeling language, the Unified Modeling Language (UML [2]),
for the communication with domain experts [11]. Process models described in
UML are automatically transformed into corresponding domain-specific parts 
of PROGRES specifications [20]. Thus, domain experts are completely shielded 
from PROGRES. Note, however, that the generic models for product, activity, 
and resource management must have been specified in PROGRES beforehand: 
the PROGRES code generated by the UML transformation tool is based on the 
pre-defined specification of the generic part. 

4 Environments 

After having provided an overview of the AHEAD system, we discuss the envi- 
ronments which it offers in a more detailed way below. 



4.1 PROGRES Environment 

The management models introduced in Section 2 are fairly complex. To describe 
them formally, we use attributed graphs and graph rewriting systems. Complex
management configurations may be represented by graphs in a very natural 
way. Using graph rewriting, we may specify all changes to these graphs in a 
uniform way. This includes creation of new versions of documents, construction of 
product configurations, planning and execution of dynamic task nets, assignment 
of resources, etc. 

More specifically, we are using the specification language PROGRES [21] to
formalize the management model. PROGRES combines concepts from database
systems, knowledge-based systems, graph rewriting, and procedural programming
into a coherent language.

production CreateFeedbackFlow
    ( SourceT : TASK ; TargetT : TASK ;
      FBType : type in FEEDBACK ; out NewFeedback : FEEDBACK ) =

    [Graphical part not reproduced. Left-hand side: nodes `1 = SourceT and
     `2 = TargetT with parent nodes `3 and `5, a ControlFlow+ path, and a
     crossed-out FeedbackFlow node `4. Right-hand side: 1' = `1, 2' = `2,
     3' = `3, 5' = `5, and a new node 4' : FBType attached by fromSourceT
     and toTargetT edges.]

    folding { `3, `5 };
    condition [ ... ];
    transfer 4'.active := true;
    return NewFeedback := 4';
end;

Fig. 3. Graph rewrite rule for creating a feedback flow

A graph schema defines the types of nodes, edges, and attributes in a similar 
way as a database schema. Derived attributes and relationships (paths) can 
be defined in a graph schema as well. Graph transformations are specified by 
high-level graph rewrite rules operating on the level of graph patterns rather 
than on the level of single nodes and edges. Rewrite rules can be combined by 
control structures to form more complex transformations. Thus, the procedural 
programming style is supported as well. 

The PROGRES specifications for CoMa, DYNAMITE, and RESMOD are
both large and complex. In total, the process meta model covers about 200 pages
of PROGRES; model definitions for specific applications may even exceed this
size. For more detailed information on the specifications of CoMa, DYNAMITE,
and RESMOD, the reader is referred to [23], [9], and [15], respectively.

Due to space restrictions, we merely present a single example for illustrating
the application of PROGRES to process modeling. The example, which is taken
from the DYNAMITE model, describes the insertion of a feedback flow into a
task net (Figure 3). A graph rewrite rule is used to specify this transformation.
The rule is supplied with input parameters fixing the source, the target, and the
type of the feedback flow to be created, and returns the new feedback flow as an
output parameter. The left-hand side (shown above the right-hand side) describes
a graph pattern to be searched in the task graph. Nodes `1 and `2, which are
fixed by the corresponding input parameters, are connected by a feedback flow
on the right-hand side. The other parts of the left-hand side define application
conditions. First, source and target must not yet be connected by a feedback
flow (negative application condition represented by the crossed node `4).
Furthermore, the source must be reachable from the target by a control flow path
(double arrow from `2 to `1) because feedback flows must be oriented opposite
to control flows. Finally, the parents of source and target must either be
siblings, or they must coincide (folding clause below the right-hand side). If
all application conditions are satisfied, the flow is created and marked as
active (transfer part for assigning attribute values).
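The application conditions described above can be paraphrased in ordinary code. The following Python sketch is an illustration only and not a translation of the PROGRES rule: the class `TaskNet` and its methods are hypothetical, the elided `condition [...]` part of the rule is not modeled, and tasks without a parent are assumed not to match the pattern.

```python
class TaskNet:
    """Minimal task net: a parent relation, control flows, and feedback flows."""

    def __init__(self):
        self.parent = {}    # task -> parent task (None for a top-level task)
        self.control = {}   # task -> set of control-flow successors
        self.feedback = {}  # (source, target) -> attributes of the feedback flow

    def add_task(self, name, parent=None):
        self.parent[name] = parent
        self.control[name] = set()

    def add_control_flow(self, source, target):
        self.control[source].add(target)

    def _reachable(self, start, goal):
        """True if `goal` can be reached from `start` by one or more control flows."""
        stack, seen = list(self.control[start]), set()
        while stack:
            task = stack.pop()
            if task == goal:
                return True
            if task not in seen:
                seen.add(task)
                stack.extend(self.control[task])
        return False

    def _parents_match(self, a, b):
        """Parents coincide, or are siblings (share a parent themselves)."""
        pa, pb = self.parent[a], self.parent[b]
        if pa is None or pb is None:
            return False  # assumption: the pattern needs existing parent nodes
        if pa == pb:
            return True
        grand = self.parent[pa]
        return grand is not None and grand == self.parent[pb]

    def create_feedback_flow(self, source, target, fb_type):
        if (source, target) in self.feedback:       # negative application condition
            return None
        if not self._reachable(target, source):     # feedback runs against control flow
            return None
        if not self._parents_match(source, target):
            return None
        flow = {"type": fb_type, "active": True}    # "transfer" part: mark as active
        self.feedback[(source, target)] = flow
        return flow


net = TaskNet()
net.add_task("ChangeRequest")
for name in ("Redesign", "Implement", "Test"):
    net.add_task(name, parent="ChangeRequest")
net.add_control_flow("Redesign", "Implement")
net.add_control_flow("Implement", "Test")

first = net.create_feedback_flow("Test", "Redesign", "ErrorFeedback")
second = net.create_feedback_flow("Test", "Redesign", "ErrorFeedback")
wrong_way = net.create_feedback_flow("Redesign", "Test", "ErrorFeedback")
print(first, second, wrong_way)
```

The second call fails because the flow already exists, and the third fails because Redesign is not reachable from Test along control flows.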

The PROGRES environment offers tightly integrated, syntax-aided tools for
editing, analyzing, browsing, and interpreting specifications (see [21] for details). 
Within the AHEAD system, these tools are used to develop the process meta 
model. In contrast, model definitions are created automatically (see below). 



4.2 Modeling Environment 

Encoding process knowledge into a PROGRES specification is a task that proba- 
bly cannot be mastered by a domain expert. To gather knowledge from a domain 
expert, we need a modeling language that is easier to understand and is much 
more wide-spread than PROGRES. In addition, it should allow for expressing 
knowledge at an informal level. 

We have selected the Unified Modeling Language (UML [2]) to satisfy these
requirements. UML is a language that serves as a standard notation for object-oriented
modeling. It offers a comprehensive set of diagrams for object-oriented
modeling, including, e.g., use case diagrams for documenting typical scenarios
of using the system to be constructed, class diagrams for structural modeling,
state diagrams for behavioral modeling, collaboration diagrams for describing
the interactions among objects, etc.

For the purpose of process modeling, we have adapted and restricted UML 
according to our requirements [11]. For example, we have tailored class diagrams 
such that they may be used to define task nets at the type level. Moreover, we 
are using only a subset of the diagrams offered by UML. So far, we have mainly 
focused on class diagrams for structural modeling, as well as state diagrams and 
collaboration diagrams for behavioral modeling. 

An example of a collaboration diagram is shown in Figure 4. The collaboration
diagram describes an event handler that reacts to the creation of a feedback
flow. While the graph rewrite rule of Figure 3 is part of the generic process meta
model, the event handler shown here is specific to the change request process
introduced earlier. The event handler assumes that a feedback flow has been
created from the test of some newly implemented module to the redesign task.

[Figure panel title: Collaboration diagram as semantics definition]

Fig. 4. Collaboration diagram for feedback handling



A collaboration diagram consists of objects and links that are required to 
be available when the method is executed. Additionally, objects and links can 
be created and destroyed during the execution of the specified method. The 
communication between objects can be defined through messages which may 
refine links. In the figure, messages 1 and 2 are used to create task parameters 
for an error report which are connected by a data flow; messages 3-5 are used 
to suspend tasks affected by the feedback and to unrelease the faulty design. 

The formal semantics of UML diagrams is defined by an automatic transformation
from UML to PROGRES. The details of this transformation are provided
in a companion paper [20]. Figure 5 shows the PROGRES code generated for
the collaboration diagram of Figure 4. Firstly, a graph test is performed.
Subsequently, graph transformations are sequentially executed in a transaction. These
graph transformations correspond to the method calls occurring in Figure 4.

For process modeling in UML, we use a commercial CASE tool (Rational
Rose). The transformation tool traverses the UML diagrams and generates a text
file containing PROGRES code. This text file is then parsed by the PROGRES
environment.
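The structure of the generated code (a test followed by a sequence of transformations inside a transaction) can be mimicked in a few lines of Python. The sketch below assumes all-or-nothing semantics: if any step fails, none of the changes take effect. The helper names `run_transaction` and `suspend` and the dictionary encoding of task states are assumptions for illustration and do not reflect how PROGRES executes transactions internally.

```python
import copy


def run_transaction(graph, steps):
    """Apply all steps to a working copy; commit only if every step succeeds."""
    work = copy.deepcopy(graph)
    for step in steps:
        if not step(work):
            return False  # the original graph is left untouched
    graph.clear()
    graph.update(work)
    return True


def suspend(task):
    """Build a step that suspends an active task and fails otherwise."""
    def step(states):
        if states.get(task) != "active":
            return False
        states[task] = "suspended"
        return True
    return step


states = {"BU_Test": "active", "Redesign": "active"}
committed = run_transaction(states, [suspend("BU_Test"), suspend("Redesign")])

other = {"BU_Test": "active"}
rolled_back = run_transaction(other, [suspend("BU_Test"), suspend("Redesign")])
print(committed, states)
print(rolled_back, other)
```

In the second call the first step succeeds on the working copy, but the failure of the second step means the state of `BU_Test` stays unchanged.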

4.3 Process Support Environment 

The process support environment operates on a management graph which is 
composed of a product graph, a task graph, and a resource graph. The code for 
operating on the management graph is generated from the PROGRES specification
(the PROGRES compiler generates C code). Thus, the domain-specific
process model defined in UML is transformed in two steps into program code 
driving the process support environment. 

The PROGRES compiler only takes care of the internal data manipulated
by the process support environment; it is not concerned with the user interface.
To generate tools from graph-based specifications, we are using the UPGRADE
framework (Universal Platform for Graph-Based Application Development [10]),
which is currently under development. UPGRADE is implemented
in Java, based on standard libraries and both public-domain and
commercial components (ILOG JViews). It mainly focuses on graphical tools,
but it also supports, e.g., tabular and tree representations. Graphical tools provide
external views on the underlying management graph, hiding all of the technical
details of the internal representation. Graph transformations defined in the
PROGRES specification are offered as user commands. A constraint-based
incremental layout algorithm automatically positions nodes and edges after the
application of a graph transformation. The user may still improve the generated
layout manually.

test Thandle_CreateFeedback_Event ( FB : FEEDBACK;
        out Redesign : Redesign_Application;
            Impl_Module : Implement_Module;
            BU_Test : Bottom_Up_Test;
            Mspec1 : ModuleSpecO;
            Mspec2 : ModuleSpecI )

    [Graphical pattern not reproduced.]

    return Redesign := `1; Impl_Module := `2;
           BU_Test := `3; Mspec1 := `5; Mspec2 := `6;

transaction Handle_CreateFeedback_Event ( FB : FEEDBACK ) =
    (* declaration of local variables for process objects *)
    Thandle_CreateFeedback_Event (...)
    & CreateParameter ( BU_Test, Error_ReportO, out ER1 )
    & CreateParameter ( BU_Test, Error_ReportI, out ER2 )
    & CreateDataflow ( ER1, ER2, FB )
    & Suspend ( BU_Test )
    & UnRelease ( Mspec1, Impl_Module )
    & Suspend ( Redesign )
end;

Fig. 5. Translation of the collaboration diagram

Using the UPGRADE framework, the process support environment is built
with minimal effort. For example, command menus and windows for entering
command parameters are generated from the operations defined in the PROGRES
specification. External views on the management graph are defined by
filtering nodes and edges based on type information. Furthermore, views may be
based on derived data (derived attributes and relationships) that may already
be defined at the PROGRES level. As a consequence, only small parts of the
process support environment have to be coded manually.
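Defining a view by type-based filtering can be pictured as follows. This Python sketch is a generic illustration and does not use the UPGRADE API; the function `make_view`, the node and edge type names, and the sample graph are assumptions.

```python
def make_view(nodes, edges, node_types, edge_types):
    """Keep nodes of the given types and edges of the given types between them.

    `nodes` maps node name -> type; `edges` is a list of (source, target, type).
    """
    visible = {name for name, kind in nodes.items() if kind in node_types}
    shown = [
        (source, target, kind)
        for source, target, kind in edges
        if kind in edge_types and source in visible and target in visible
    ]
    return visible, shown


nodes = {
    "Redesign": "TASK",
    "Test": "TASK",
    "design.v2": "VERSION",
    "Smith": "RESOURCE",
}
edges = [
    ("Redesign", "Test", "ControlFlow"),
    ("Test", "Redesign", "FeedbackFlow"),
    ("Redesign", "design.v2", "Output"),
    ("Smith", "Redesign", "AssignedTo"),
]

# A task-net view for the manager: only tasks and the flows between them.
task_nodes, task_edges = make_view(
    nodes, edges, {"TASK"}, {"ControlFlow", "FeedbackFlow"}
)
print(sorted(task_nodes), task_edges)
```

Edges whose end points are filtered out are dropped as well, so the view never shows dangling edges.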

Figure 6 shows a screen shot taken from the management environment. The
project manager is supplied with a graphical view on a task net for our sample
extension request process. Using such graphical views, the manager may analyze
the current project state, assign tasks to developers, handle feedback, update
the plan according to changes in the product structure, etc.

Fig. 6. Management environment

The work environment (not shown in a figure) provides developers with a 
tabular agenda of tasks to be done. From the agenda, a developer selects a task 
to work on. Then, the task’s workspace is displayed, consisting of all documents 
relevant for this task. External development tools such as CASE tools, editors, 
compilers, etc. may be started on these documents. To this end, the work en- 
vironment offers commands for tool activation. An external tool is started by 
means of a wrapper which supplies the document(s) to work on, prepares the
operating system environment, and invokes the tool with appropriate parameters.
In this way, developers are shielded from the details of tool activation. 
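A wrapper of this kind might look roughly as follows in Python. The sketch only illustrates the three duties named above (supplying documents, preparing the environment, invoking the tool); the function `run_tool`, the environment variable `AHEAD_TASK`, and the use of the Python interpreter as a stand-in for an external tool are assumptions, not details of the AHEAD implementation.

```python
import os
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_tool(command, documents, task_name):
    """Copy documents into a fresh workspace, set up the environment, run the tool."""
    with tempfile.TemporaryDirectory() as workspace:
        copies = [shutil.copy(doc, workspace) for doc in documents]
        env = dict(os.environ, AHEAD_TASK=task_name)  # hypothetical variable name
        result = subprocess.run(
            [*command, *copies],
            env=env,
            cwd=workspace,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout


# Stand-in "tool": prints the task name and the file it was started on.
echo_tool = [
    sys.executable,
    "-c",
    "import os, sys; print(os.environ['AHEAD_TASK'], os.path.basename(sys.argv[1]))",
]

with tempfile.TemporaryDirectory() as source_dir:
    document = Path(source_dir) / "module.c"
    document.write_text("/* module implementation */\n")
    output = run_tool(echo_tool, [document], "Implement_Module")

print(output.strip())
```

A real wrapper would also have to copy modified documents back and register them as new versions; that part is omitted here.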



4.4 Summary of the Tool Construction Process 

Using AHEAD, a domain-specific management system is constructed in the fol- 
lowing steps: 

1. The modeling environment is used to define the domain-specific process 
model in terms of UML diagrams. 

2. The transformation tool generates PROGRES code from the domain-specific
UML model.

3. The domain-specific PROGRES code is compiled into C code by
the compiler that is part of the PROGRES environment.

4. External development tools are integrated with the AHEAD system with 
the help of wrappers. 

5. The UPGRADE framework is compiled with the generated C code and the
tool wrappers, resulting in a domain-specific process support environment.

The construction of the domain-independent part of the AHEAD system
(i.e., the “infrastructure” into which the domain-specific parts are implanted)
involves the following steps:

1. The PROGRES specifications of the generic models for product, activity, 
and resource management are created with the help of editor, analysis, and 
interpreter tools provided by the PROGRES environment. 

2. The user interface of the management system is developed with the help of 
the UPGRADE framework. This involves the definition of views and tools 
for both managers and developers. 

3. The code compiled from the specification is combined with the user interface
for the process support environment. As a result, we obtain a generic
process support environment, which is able to handle generic types of products,
activities, and resources.

4. The modeling environment is implemented by adapting a commercial CASE
tool (Rational Rose). The transformation tool accesses the database of the
CASE tool and generates a text file containing PROGRES code.

Note that the generic process support environment can be employed mean- 
ingfully when there is no process knowledge available yet. In such a situation, 
the process support environment can be used in an “ad hoc” mode to gather 
experience that can then be turned into a domain-specific process model. This 
approach considerably reduces the start-up time for applying the process support 
environment. 

5 Related Work 

Numerous management systems have been designed and implemented to support 
the coordination of development processes. Project management systems [13,12] 
support management functions such as planning, organizing, monitoring, and 
controlling. Engineering data management systems (EDM [18]), product data
management systems (PDM [7]), and software configuration management systems
(SCM [22,25]) assist in managing the products of development processes
in different engineering disciplines such as electrical, mechanical, and software
engineering, respectively. Workflow management systems [16] have been applied
in banks, insurance companies, administrations, etc. to manage the flow of work
between participants according to a defined procedure consisting of a number
of tasks [17]. Finally, process-centered software engineering environments
(PSEE [4,5,3]) support the execution of software processes, driven by process
models which define the activities to be executed, inputs and outputs, the
resources required for execution, etc.

AHEAD differs from these systems with respect to all of its key features
briefly described in the introduction:

1. Existing systems do not equally cover products, activities, and resources.
Project management systems, workflow management systems, and PSEEs 
primarily focus on activities, while EDM, PDM, and SCM systems address 
product management. 

2. The dynamics of development processes is not adequately taken into account. 
In particular, workflow management systems sharply distinguish between 
build time (definition of a workflow) and run time (workflow execution). 
Changes to workflows at run time are at best supported to a limited extent. 

3. While AHEAD is human-centered and provides interactive management
tools, PSEEs and workflow management systems tend to automate man- 
agers by replacing them with process programs. This approach does not work 
because of the inherent dynamics of development processes, which requires 
many human decisions during execution. 

4. Most existing systems lack a formal definition of their underlying models for 
managing products, activities, and resources. Concerning the management 
of activities, however, there are several systems which are based e.g. on Petri 
nets, which do have formally defined semantics [1,6]. However, the semantics 
of Petri nets deals only with activities. In contrast, we employ graph trans- 
formations for specifying products, activities, and resources. Moreover, the 
evolution of Petri nets is described outside of the Petri net formalism, while 
the evolution of task nets in AHEAD is also formally described by graph 
transformations.

5. AHEAD is based on a framework for generating tools from graph-based 
specifications. Thus, the underlying process meta model (consisting of CoMa, 
DYNAMITE, and RESMOD) may be changed rather easily with modest 
effort. This does not apply to other systems with hard-coded process meta 
models. 

6. For communicating with domain experts, we are using a widespread object-oriented
modeling language. So far, our experiences concerning the expressiveness
of UML have been positive. In contrast, numerous specialized process
modeling languages have been defined, in particular for PSEEs, even
though their underlying concepts are often similar. In addition, process modeling
languages for PSEEs and workflow management systems tend to focus
on process programming, while AHEAD also addresses the early phases of
process engineering.

6 Conclusion 

We have presented a graph-based system for managing and modeling develop- 
ment processes. Currently, the AHEAD system is still under development. We 
expect to complete the implementation in spring 2000. To give an impression of 
the development effort involved, let us present some numbers: The PROGRES 
system, which has been available for several years and is reused as an important 
component of the AHEAD system, comprises about 700,000 lines of code (loc). 
The specifications of the generic models for managing products, activities, and 
resources cover about 200 pages of PROGRES code. The transformer from UML 
to PROGRES was implemented in 13,000 loc. Finally, the UPGRADE framework 
currently consists of about 40,000 loc, and the specific extensions required 
for the AHEAD system will require about 10,000 loc. 

Before starting our work on the AHEAD system, we implemented a manage- 
ment system for development processes in the mechanical engineering domain 
within the SUKITS project [19,24]. This system relied on predecessor versions 
of the models for managing products, activities, and resources. The SUKITS 
management system was fully functional and was integrated with about a dozen 
tools for design, manufacturing planning, NC programming, etc. For evaluation 
and demonstration purposes, the system was applied to several development 
processes, including e.g. the development of a drill. 

Abstract. Supporting technical development processes through process 
management environments is vital for a project’s success. While pro- 
cess enactment enables a project manager to plan and monitor a pro- 
cess and guides the participating developers, process modeling aims at 
understanding, communicating and reusing process descriptions. Thus, 
requirements for languages supporting process enactment are quite dif- 
ferent from those for languages supporting process modeling. 

In this paper we demonstrate how the task of process modeling can be 
tackled using a standard object-oriented modeling notation, the Unified 
Modeling Language. By transforming the resulting model into the formal 
notation of an underlying generic process model, we support its enact- 
ment. This generic model has been formally specified within the graph 
transformation system PROGRES. 

In this way we are able to provide suitable languages for process model- 
ing and enactment within one coherent environment. 

Keywords: Process Modeling, Process Enactment, Graph Transforma- 
tions, Unified Modeling Language 



1 Introduction 

The success of technical development processes is dependent on the coordina- 
tion of the participating developers [19]. A process management environment 
supports coordination by providing tools to plan, enact and monitor a process 
model and by providing developers with work contexts and information about 
their responsibilities, deadlines etc. 

To support development processes adequately, a management environment is 
required to be able to deal with their inherent dynamism. This dynamism arises 
through changing requirements, feedback to earlier stages of the development 
process, simultaneous and forward engineering, moved deadlines and shrinking 
budgets. The architecture and functionality of a suitable management environ- 
ment has been presented in [18] and [6]. 

The work described in this paper was partially supported by the Deutsche 
Forschungsgemeinschaft within the collaborative research center 476 (“IMPROVE”). 
Within this project we cooperate with Bayer AG, Leverkusen, Germany. 

M. Nagl, A. Schürr, and M. Münch (Eds.): AGTIVE’99, LNCS 1779, pp. 341-357, 2000. 

[Figure: four levels, from top to bottom: process meta model; process model 
definition; process model instance; real-world process. The meta model 
constrains the definition, the definition constrains the instance, and the 
instance maps and plans the real-world process.] 

Fig. 1. Modeling framework 

This paper will focus on adequate modeling support within such a management 
environment. In the context of development processes the modeling framework 
of Figure 1 is widely used. It distinguishes four levels. A process meta 
model provides a process modeling language for process model definitions. The 
latter are schematic, type-level process models and valid instances of the pro- 
cess meta model. Process model instances are valid instances of a process model 
definition and are mappings of the real-world process. With respect to devel- 
opment processes, process model definitions are the means to define reusable 
process specific knowledge, since the instance level is highly dynamic and result- 
ing structures cannot be reused in most cases. 

Within the AHEAD ^ environment dynamic task nets have been developed 
as a process meta model [4,5]. The meta model supports the continuous 
structural evolution during process enactment and thus meets the formulated 
requirements (cf. section 2). Dynamic task nets have been formally specified in 
the Programmed Graph Rewriting System (PROGRES) [13]. 

The question arises how domain specific schematic process knowledge is fed 
into the management environment to restrict the structure and behavior of the 
instance level task nets. Extending the PROGRES specification of dynamic task 
nets is a very unattractive approach, since it requires an expert on graph transfor- 
mations as a process modeler. Additionally, experience has shown that processes 
cannot be modeled very intuitively. 

In contrast, an object-oriented modeling approach is very attractive, since 
dynamic task nets are evolving structures of interrelated and interacting ob- 
jects. As an object-oriented modeling language the Unified Modeling Language 
(UML) [1] is the natural choice. It bears the implicit advantages that process 
model definitions can be communicated easily to a large number of people and 
that it contains a number of diagrams that are very appropriate for structural 
and behavioral process modeling on the type level. 

However, in providing abstract UML-based process model definitions, we lose 
the ability to enact them. This paper will describe how we can make the two 

^ Adaptable Human-Centered Environment for the Administration of Development 
Processes 
ends meet. It will show how the UML can be applied to create process model 
definitions (cf. section 3) and it will describe how these models can be formalized 
by transformation into an extension of the graph grammar based specification 
of dynamic task nets which then contains the meta model and a valid generated 
process model definition. 

2 Dynamic Task Nets 

As a formal process meta model, dynamic task nets have been introduced, which 
provide mechanisms to model development processes and their inherent dynamics. 
Modeling a process as a dynamic task net means splitting the overall process 
into subprocesses and clarifying their respective behavior. A task consists of the 
description of what is to be done, the task interface, and of a description of how 
it is to be done, the task realization. Dynamic task nets may contain complex 
(refined) and atomic realizations. 

Tasks are connected through task relations. Control flow relations introduce 
a temporal ordering on tasks. Feedback flow relations are always directed 
opposite to control flow relations and are inserted to mark iteration or exception 
steps of a process. 




Fig. 2. Example of a dynamic task net 

Fig. 3. Graph schema for dynamic task nets 

Outputs to be produced and inputs needed by a task are modeled as pa- 
rameters of a task’s interface. Between parameters data flow relations may be 
introduced which indicate the flow of products through a task net. 

Figure 2 gives a small example of a dynamic task net’s structure ^ and its 
evolution during enactment. At the beginning of a software project only little is 
known about the development process. A design task is introduced into the net 
(part i), while the following structure remains unspecified as it depends on the 
design document’s internal structure (part ii). As soon as this is produced, the 
task net can be completed (part iii). 

Output documents of tasks are versioned and can be released on a task-by- 
task basis. Part iv) shows how a version of a design document is produced after 
feedback occurred from the task implementing module B to the design task. This 
new version is at first only selectively released. 

Internally dynamic task nets are represented as graphs of typed nodes and 
edges. A task graph as the internal data structure of a process model instance 
consists of nodes representing tasks, parameters and tokens as references to 
products created during the process. Task relations and data flows are internally 
represented by nodes, because they carry attributes and neither PROGRES 
nor the underlying graph database supports attributed edges. Edges are used to 
connect task, parameter and token nodes. 
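
The representation described above can be illustrated with a minimal Python sketch. It is not the PROGRES graph model itself: the classes `Node` and `TaskGraph`, the edge labels `from` and `to`, and the attribute `enabled` are assumptions made for illustration. The sketch only shows the idea that a relation carrying attributes is stored as a node of its own and attached with plain, attribute-free edges.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)  # simple id generator for graph nodes


@dataclass
class Node:
    node_type: str                                   # e.g. "TASK", "CONTROL_FLOW"
    attrs: dict = field(default_factory=dict)        # attributes live on nodes only
    id: int = field(default_factory=lambda: next(_ids))


class TaskGraph:
    def __init__(self):
        self.nodes = {}      # id -> Node
        self.edges = set()   # (source id, label, target id); edges carry no attributes

    def add_node(self, node_type, **attrs):
        node = Node(node_type, attrs)
        self.nodes[node.id] = node
        return node

    def add_edge(self, source, label, target):
        self.edges.add((source.id, label, target.id))

    def add_relation(self, rel_type, source, target, **attrs):
        """Store an attributed relation as a node linked by two plain edges."""
        rel = self.add_node(rel_type, **attrs)
        self.add_edge(source, "from", rel)   # hypothetical edge label
        self.add_edge(rel, "to", target)     # hypothetical edge label
        return rel


graph = TaskGraph()
design = graph.add_node("TASK", name="Design")
impl = graph.add_node("TASK", name="Implement Module A")
cflow = graph.add_relation("CONTROL_FLOW", design, impl, enabled=True)
```

The control flow between the two tasks thus costs one extra node and two edges, which is the price paid for keeping attributes off the edges.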

In order to restrict the graph to meaningful structures with respect to the 
process meta model, a graph type is defined by a graph schema. The process meta 
model’s graph schema is displayed in Figure 3. Node class ITEM serves as the 
root class. On the next layer of the inheritance hierarchy we mainly distinguish 
between process entity and process relationship types. The node class TOKEN 
describes nodes representing tokens that are passed along data flows. TASK nodes 
own PARAMETER nodes which are either INPUT or OUTPUT parameters. Tasks can 
produce tokens via output parameters and read tokens via input parameters. 
Tasks have an assigned REALIZATION which determines a subnet’s structure and 
serves as a vertical TASK RELATION. Horizontal relationships are CONTROL FLOW 

^ In this case an instance of a standard unconstrained process model definition is used 
to demonstrate the concept of dynamic task nets 

transaction STD_AbortTrans ( Task : STD_TASK ) = 
    ((Task.State in ("Active" or "Suspended" or "Planning")) 
     and ((Task.=ToParent=> : STD_TASK.State = "Active") 
          or Task.IsRoot)) 
  & for all t : STD_TASK := Task.=CurrentChildren=> : STD_TASK do 
      choose 
        when (t.State in ("Active" or "Planning" or "Suspended")) 
        then 
          STD_AbortTrans ( t ) 
        else 
          skip 
      end 
    end 
  & Task.State := "Failed" 
end 

Fig. 4. State transition diagram and its formal specification 

or FEEDBACK FLOW relations. Parameters in turn are connected via DATA FLOW 
relations. 

To consistently manipulate and enact a task net, a set of operations for 
editing and execution are specified as graph transformations and procedures of 
these. Task nets can be edited by introducing new entities and relationships into 
the net or by removing some. Operations to execute a task net deal e.g. with the 
change of task states and the token game between tasks. Please note that editing 
and execution of task nets are both described using a uniform mechanism. This 
allows us to specify the intertwined editing and execution of task nets. 

The executional semantics of dynamic task nets are based on cooperating 
state machines. Every task has a lifecycle determined by the state transition 
diagram displayed in part i) of Figure 4. The state diagram is formally specified 
in PROGRES by supplying an attribute State for the node class TASK and by 
specifying a set of transactions, one for each state transition. 

Some standard behavior of dynamic task nets is defined with respect to this 
state transition diagram. 

— State transitions may influence the context of the task performing the state 
change. Abortion of a task leads to the abortion of all of its subtasks (cf. 
part ii of Figure 4). 

— All editing and execution operations are dependent on the current state of 
the task to be manipulated or its parent task. Operations for editing, like 
insertion of a new dataflow, can only be executed if the parent task of the 
subnet in question is currently in states InDefinition or Planning, while 
operations for performing the token game can only operate on tasks being 
in state Active. 

— Tasks can only be activated if all predecessors left state Waiting and can 
only be committed if all predecessors have been successfully completed. 
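
The standard behavior listed above can be sketched in Python as follows. This is an illustration under assumptions, not the PROGRES specification: the class `Task` and its methods are hypothetical, only the state names that occur in the text are used, and a failing abort of a subtask is simply ignored here rather than failing the whole operation.

```python
ABORTABLE = {"Active", "Suspended", "Planning"}  # states named in the abort transaction


class Task:
    def __init__(self, name, state="InDefinition", parent=None):
        self.name = name
        self.state = state
        self.parent = parent
        self.children = []
        self.predecessors = []
        if parent is not None:
            parent.children.append(self)

    def abort(self):
        """Fail this task and, recursively, every subtask that is still abortable."""
        parent_ok = self.parent is None or self.parent.state == "Active"
        if self.state not in ABORTABLE or not parent_ok:
            return False
        # Children are aborted while this task is still Active, so their own
        # parent check succeeds; only afterwards does this task become Failed.
        for child in self.children:
            if child.state in ABORTABLE:
                child.abort()
        self.state = "Failed"
        return True

    def can_activate(self):
        """A task may be activated once all predecessors have left state Waiting."""
        return all(p.state != "Waiting" for p in self.predecessors)


root = Task("Develop Software System", state="Active")
design = Task("Design", state="Active", parent=root)
test = Task("Bottom-Up Test", state="Waiting", parent=root)
test.predecessors.append(design)
```

Calling `root.abort()` in this setup fails `root` and `design` but leaves `test` in state Waiting, since only tasks in an abortable state are touched.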

In order to allow for the handling of dynamic, non-anticipated situations in 
the model, every state transition and every operation is followed by an event (cf. 
SendAbort statement in Figure 4, part ii). 

^ For a detailed description on the use of graph transformations within the formal 
specification of dynamic task nets see [4] 
We presented the main features of a generic process model. In order to manage 
a certain process, domain-specific knowledge has to be provided. A domain-specific 
process structure introduces new task and parameter types and refines 
the generic relationship types. Process behavior has to be defined in terms of ex- 
ecutional constraints in relation to the generic state diagram and specific event- 
handlers. The following section will show how this adaptation of the generic 
process model can be accomplished. 

3 Using UML for Process Model Definition 

Modeling domain-specific development processes based on dynamic task nets by 
extending the meta model’s specification is a rather tedious task [10]. Especially 
for someone who is not an expert on graph transformations but an expert in 
the technical development domain, a considerable abstraction from the underly- 
ing mechanism has to be found. While structural model definition can at least 
take place in a straightforward manner by extending the graph schema, there is 
no intuitive mapping between the concepts of behavioral model definition and 
PROGRES’ language constructs. For this reason it is necessary to provide a 
modeling tool within the AHEAD environment that allows a non-expert on graph 
transformations to model a process. To this end we use the Unified Modeling 
Language (UML). 

3.1 Defining a UML Extension to Map the Process Meta Model 

The UML is not designed in conformance with our process meta model. Thus, 
we would like to restrict the UML’s language constructs to meaningful ones with 
respect to our application domain. However, the UML does not provide a full- 
fledged meta modeling facility. Defining a meta model can only be achieved by 
extending the UML’s own meta model, which would result in a UML-variant. 
The only way to specify a meta model in conformance with the UML is to 
define stereotypes and constraints on these. A stereotype is a virtual meta class 
refining one of UML’s own meta classes and thus enables a modeler to give the 
standard modeling elements additional semantics. Providing constraints leads to 
a restriction of possible diagram structures. Using stereotypes for meta modeling 
is not very satisfactory, since many syntactical and semantical constraints cannot 
be expressed. However, this way of “meta modeling” is conformant with the 
OMG’s proposal to define UML extensions [15]. A cutout of the extension used 
for our modeling approach is described in this section. 

Figure 5 i) displays a table with the used stereotypes and the symbols 
representing them in UML diagrams. UML’s meta class “class” is refined by meta 
classes for task, input and output parameter and realization classes. In addition 
UML’s meta class “package” is refined to categorize packages related to their 
content. Various stereotypes are established to distinguish between different 
associations. The constraints that have to hold on these association subclasses are 
defined in a separate table in Figure 5 ii). 

Fig. 5. Stereotypes of the UML extension 



The table defines the source and target types of associations carrying the 
given stereotype. Additional context sensitive constraints can be defined using 
the Object Constraint Language (OCL) [17] which is part of the UML. Since 
understanding these constraints requires a detailed knowledge of the UML meta 
model they are omitted here. 



3.2 Structural Process Model Definition 

In our approach the UML is used for process model definition. Resulting pro- 
cess model definitions thus abstract from a multitude of instance level task nets 
(an example of which was presented in Section 2). A process model definition’s 
structure consists of task, parameter and realization classes and their various 
interdependencies. Consequently, we use UML’s class diagrams to model this 
structure. 

Conceptually, we distinguish a task’s interface from its realization. The in- 
terface defines a task’s contract such as its parameter profile and its external 
behavior. It abstracts from possible realizations, one of which can be selected by 
the corresponding actor — e.g. a project manager — at enactment time. 

Figure 6 shows the interface of the task class Develop Software System at 
the top. The shown interface consists of the task class and its composed input 
and output parameter classes. Cardinalities may be defined together with the 
compositions to restrict the number of parameters of a certain class on the 
instance level. The interface is stored in a separate package^. 

A task class composes all its respective realization classes (symbolized by a 
doubly rimmed rectangle). Since the interface abstracts from a set of possible 
realizations, the realization classes are not part of the interface of a task package. 
Rather an individual package is introduced for every realization class. Realization 

^ The use of the UML’s package concept to structure a process model and to encourage 
reuse of model fragments will not be explained in this paper. A detailed explanation 
can be found in [7]. 

Fig. 6. Task interface and realization 

classes abstract from complex subprocess definitions as shown in Figure 6. These 
allow for the composition of other task classes through control flow (stereotype 
«cflow») and feedback flow (stereotype «fback») associations. The direction 
of these associations is indicated through the rolenames src and trg. For example, 
the Interface Design and Implement Module task classes are connected by a control 
flow association which indicates that implementation of modules cannot take 
place before work on a corresponding module specification was started (however, 
execution may overlap). Analogously, feedback is allowed between the Bottom- 
Up Test and the Interface Design task classes in case of errors during the test. Of 
course, feedback might as well occur in other parts of this subprocess model. 

Control flow and feedback flow associations define potential channels for 
data flow, which is explicitly modeled through data flow associations (stereotype 
«dflow») between parameter classes. Data flow associations can be introduced 
between an input and an output parameter class, with the output class playing 
the role of source. Vertical data flow can be defined between two input or two 
output parameters. This gives the complex parent task the possibility to supply 
the refining tasks with their input data and to receive their results. 

In our example the Design task is supplied with the initial requirements doc- 
ument. There, the requirements are analyzed and a design document is created, 
which is subsequently sent to the Interface Design tasks. After all modules have 
been implemented and successfully tested, the running system is sent to the 
parent task as the result of process execution. 
Again, cardinalities are used to restrict the number of instances of the classes 
at enactment time. In the example the number of Design tasks is restricted to 
exactly one, while Interface Design and Implement Module tasks may occur in any 
number, which depends on the number of modules to be implemented. 

3.3 Behavioral Process Model Definition 

After we have shown how a process model definition can be structurally specified 
with class diagrams (and packages) we will now turn our attention to behavioral 
modeling. 

One way to model a task class’ specific behavior is to define conditions for the 
transitions of the state diagram. These are formulated in the OCL and provide 
the means to influence the way a task net is executed. Different development 
policies like concurrent and sequential engineering can be realized. 

A cutout of a sample state diagram for the Bottom-Up Test task class is 
shown in Figure 7. It contains a condition for the start transition, which may 
be executed only when all inputs are available (the standard behavior does not 
demand this). 

Since the presence of all inputs of the Bottom-Up Test is enforced, we can now 
automate their consumption. The UML allows for the definition of actions inside 
of a state to automate execution steps. An action can be triggered by entering or 
by leaving a state. The sample state diagram of Figure 7 leads to the automatic 
consumption of all inputs when the state Active is entered. 

Every method being executed by a task by default sends out a corresponding 
event to its predecessors, successors, children, and parent. The underlying pro- 
cess engine provides an event/trigger mechanism that triggers the execution of 
event handlers in the receiving objects. These event handlers enable task objects 
to react to actions performed in their individual context. 

Figure 7 shows how the set of event-receiving tasks can be restricted through 
send-clauses at specific transitions. The sample state diagram shown in Figure 7 
restricts the target set of the CreateFeedback event to the task’s parent. 
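
The default event distribution and its restriction by a send-clause can be sketched as follows. The class `TaskObj` and its methods are hypothetical names chosen for this illustration and do not reflect the API of the underlying process engine; the sketch only assumes that an event goes to predecessors, successors, children and parent unless an explicit receiver set is given.

```python
class TaskObj:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.predecessors = []
        self.successors = []
        self.received = []  # (event, sender name) pairs, kept for inspection
        if parent is not None:
            parent.children.append(self)

    def default_receivers(self):
        """Predecessors, successors, children and parent of this task."""
        receivers = self.predecessors + self.successors + self.children
        if self.parent is not None:
            receivers.append(self.parent)
        return receivers

    def send(self, event, receivers=None):
        """Send an event; an explicit receiver list plays the role of a send-clause."""
        targets = self.default_receivers() if receivers is None else receivers
        for target in targets:
            target.handle(event, self)

    def handle(self, event, sender):
        # Placeholder event handler: a real handler would react to the event.
        self.received.append((event, sender.name))


parent = TaskObj("Develop Software System")
design = TaskObj("Interface Design", parent)
test = TaskObj("Bottom-Up Test", parent)
test.predecessors.append(design)

test.send("Started")                                    # default distribution
test.send("CreateFeedback", receivers=[test.parent])    # restricted to the parent
```

After these two calls the parent has seen both events, while the predecessor has seen only the first one.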

Defining the specific behavior of a process model includes the specification 
of custom event handlers. An event handler can be specified for every task class. 
The UML allows a method’s semantics to be specified in any language, like C++ 
or pseudo code. For example, the automatic activation of a task, dependent on 

[Figure: cut-out of a state diagram. Recoverable labels: guard 
[self.inp->forAll(InputAvailable = true)]; state Planning; an entry/ action; 
send-clause ^self.parent.handle_CreateFeedback_Event(src, trg)] 

Fig. 7. Cut-out of a specific state diagram 

certain data being released by a predecessor, can be modeled with task class 
specific event handlers. However, the expressiveness of these event handlers is 
very limited since the task class’ context is not known at the time of its specifi- 
cation (in fact, a task class may be reused in different contexts, i.e. realizations 
of complex tasks). We therefore allow for the definition of realization class spe- 
cific event handlers. A realization receives all events that are being sent to its 
assigned task. In this way events being sent to the parent of a task net can lead 
to very complex task net transformations since the structure of the subprocess 
is well known to the realization. If processes are enacted multiple times, process 
knowledge grows and many matters can be handled automatically. 

Complex event handlers can be specified through UML’s collaboration dia- 
grams which are used to define the semantics of an event handling method. A 
collaboration diagram consists of objects and links that are required to be avail- 
able when the method is executed. Additionally, objects and links can be created 
and destroyed during the execution of the specified method. The communication 
between objects can be defined through messages which may refine links. 

Figure 8 gives an example of how a CreateFeedback event can be handled 
automatically through this mechanism. The event handler is defined for feed- 
back occurring between tasks of type Bottom Up Test and Interface Design. In 
addition to the feedback flow’s source and target objects we search for an inter- 
mediate task of type Implement Module and some parameter objects. The event 
handling method then creates two parameter objects that allow exchange of an 
error report. Objects marked with the constraint {new} will be created dur- 
ing the execution of the event handler. Destruction of objects can be achieved 
through the constraint {destroy}. By installing a data flow link between these 
parameter objects the feedback flow is refined and an error report can be sent to 
the feedback flow’s target. Links created by the method are marked with the con- 
straint {new} as well. Finally, the tasks’ behavior is specified through messages. 
In the example case, the feedback flow’s source is suspended (message 1), the 
erroneous output document of the feedback flow’s target is retracted (message 2) 
and the intermediate implementation task is suspended (message 3). 

4 Formalizing UML-Based Process Model Definitions 

Enacting a modeled process requires the generic model to consider the domain- 
specific structural and behavioral constraints. In terms of the process structure 
it means that only modeled classes can be instantiated and related according 
to the process model definitions in consideration of the modeled cardinalities. 
In terms of process behavior it means that conditions from refined state tran- 
sition diagrams have to be checked and automated actions have to be invoked. 
Modeled event-handlers have to be triggered and successfully completed if their 
pre-condition holds during enactment. To reach these aims, the UML model is 
transformed into an extension of the meta model’s PROGRES specification (cf. 
section 2), thus providing a formalized interpretable version of the process model 
definition. 



4.1 Transforming the Structural Model Definition 

The structural model definition can be transformed into PROGRES code by 
extending the meta model’s graph schema with new node types. Each class 
marked with the stereotypes task, realization, input or output is transformed 
into a node type instantiating the corresponding node class. Each association 
marked with a stereotype cflow, fback or dflow is transformed into a node type 
instantiating the corresponding node class in the same fashion. Meta attributes 
(which are schema-level attributes) are used to fix the relations between these 
node types and to constrain the cardinalities. 

In the following we will present examples from the transformation of the 
software development process model definition introduced in Figure 6. 

A task class is transformed into a node type as an instance of node class 
TASK (cf. Figure 9). The inherited meta attributes of the node class are refined 
for the node type to reflect the associations a task class has to other classes in 
the UML model. In particular the attribute DeclaredRealizations fixes the 



node class TASK is a ENTITY 
  meta 
    DeclaredParameters : type in PARAMETER [0:n] := PARAMETER; 
    OblParameters : type in PARAMETER [0:n] := nil; 
    MultParameters : type in PARAMETER [0:n] := PARAMETER; 
    DeclaredRealizations : type in REALIZATION [0:n] := REALIZATION; 
end; 

node type Bottom_Up_Test : TASK 
  redef meta 
    DeclaredParameters : Module or Module_Spec or Running_System or 
                         Error_Report; 
    OblParameters : Module_Spec or Module; 
    MultParameters : Module_Spec or Module; 
    DeclaredRealizations : WhiteBoxTest or BlackBoxTest; 
end; 



Fig. 9. Transformation of task classes 



352 Ansgar Schleicher 



node type Standard : REALIZATION 
  redef meta 
    DeclaredInterface : Develop_Software_System 
    ChildTasks : Design or Interface_Design or 
                 Implement_Module or Bottom_Up_Test; 
    OblChildTasks : Design or Interface_Design or Implement_Module; 
    MultChildTasks : Interface_Design or Implement_Module 
                     or Bottom_Up_Test 
    TaskRelations : CF_Des_to_IntDes or FB_Test_to_IntDes or ... 
    ParameterRelations : DF_MSpec_to_MSpec or ... 
end; 



Fig. 10. Transformation of realization classes 

set of realization types corresponding to the task’s interface and thus reflects 
associations of stereotype may_realize. In the same fashion the set of parameter 
types of this interface is constrained and thus associations of stereotype may_have 
are reflected. The difference is that in this case cardinalities have to be taken 
into account. For this reason three meta attributes are used to fix the set of all 
possible parameter types, the set of those with obligate cardinality (1, 1..*) in 
the UML model, and the set of parameter types with multiple cardinality (0..*, 1..*). 

Transforming a realization class is very similar. A new node type is intro- 
duced as an instance of the node class REALIZATION and the meta attributes are 
redefined to express the structural constraints defined in the UML model. Fig- 
ure 10 shows the schema extension performed for the realization class Standard 
from Figure 6. In this case meta attributes are used to reflect the associations 
of stereotype may_contain and their cardinalities. In addition the sets of possible 
task and parameter relation types are stored in meta attributes. 

Task and parameter relations themselves are mapped into separate node 
types. In this case meta attributes are used to constrain the source and target 
types of the relation and to define their respective cardinalities. 

When an enacted process model instance is supposed to be structurally ma- 
nipulated, the corresponding graph transformations check the meta attributes 
to enforce structural consistency with the process model definition. Within the 
graph transformation for e.g. creation of a subtask it is checked whether the 
type of the task to be created appears within the ChildTasks attribute of the 
supertask’s assigned realization. 
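
A minimal Python sketch of such a consistency check is given below. The dictionary mirrors the meta attributes ChildTasks, OblChildTasks and MultChildTasks of the realization type Standard from Figure 10; the function names and the dictionary layout are assumptions made for illustration. Following the reading given for parameters, "obligate" is taken to mean at least one instance and "multiple" to mean more than one instance allowed.

```python
# Meta attributes of realization type Standard, taken from Figure 10.
REALIZATION_SCHEMA = {
    "Standard": {
        "ChildTasks": {"Design", "Interface_Design",
                       "Implement_Module", "Bottom_Up_Test"},
        "OblChildTasks": {"Design", "Interface_Design", "Implement_Module"},
        "MultChildTasks": {"Interface_Design", "Implement_Module",
                           "Bottom_Up_Test"},
    }
}


def may_create_subtask(realization, task_type, existing_children):
    """Check whether a subtask of task_type may be added under realization."""
    meta = REALIZATION_SCHEMA[realization]
    if task_type not in meta["ChildTasks"]:
        return False                        # type not declared for this realization
    if task_type in meta["MultChildTasks"]:
        return True                         # any number of instances allowed
    return task_type not in existing_children  # at most one instance otherwise


def missing_obligatory(realization, existing_children):
    """Return the obligatory child task types that have no instance yet."""
    meta = REALIZATION_SCHEMA[realization]
    return meta["OblChildTasks"] - set(existing_children)
```

With this table a second Design task is rejected, while further Interface_Design tasks are accepted, which matches the cardinalities stated for the example.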

4.2 Transforming the Behavioral Model Definition 

The generic state transition diagram can be enhanced by project specific tran- 
sition conditions and automated action calls. As we have shown before, each 
transition or generic operation is internally specified as a PROGRES transaction. 
The problem that arises when transforming the behavioral model is that 
we have to map an object-oriented model to a non-object-oriented specification 
language. The generic operations are not directly related to a node class. There- 
fore, they cannot be redefined for instances of a node class in the same manner 
as meta attributes can. 

Fig. 11. Transformation of a refined operation 

For this reason, we have to simulate the binding of operations to objects 
in the specification. We introduce a new transaction which simulates the late 
binding of operations. It receives a node as an actual parameter and determines 
the operation to call from the node’s type. 

Inside of a transaction reflecting a refined transition the exit-actions of a 
transition’s source state can be invoked, the transition’s pre-condition can be 
checked, the transition fired (by calling the generic transition)^ and finally the 
entry-actions of the new state can be invoked. For all other generic operations 
(those that do not trigger a state change) the exit- and entry-actions of the 
source and target states are omitted, because no state change is performed. 

Figure 11 shows how the Start transition modeled in Figure 7 is transformed 
into a PROGRES transaction and how the flow of control simulates late binding. 
Within the meta model a transaction VIRT_Start is available which calls a 
generated transaction Virt_Start. Within Virt_Start an entry exists for every 
task type that refines the generic start transition. Within that entry the refined 
transition (e.g. Test_Start) is called, where conditions are checked and automated 
actions invoked. Within the refined transition (operation) the call to the 
generic transition (operation) is placed. 
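
The control flow just described can be sketched outside PROGRES as a dispatch on the node type. The following Python fragment is only an illustration under assumptions: tasks are plain dictionaries, the pre-condition and the state names are made up, and the function names merely echo the transactions named in the text.

```python
def generic_start(task):
    """Generic transition: defines the standard behavior and changes the state."""
    task["state"] = "Active"
    task["log"].append("generic Start")


def start_of_test_task(task):
    """Refined Start transition for the task type 'Test' (condition is made up)."""
    task["log"].append("exit-actions")
    if not task.get("inputs_available", False):
        raise RuntimeError("pre-condition of refined Start not satisfied")
    generic_start(task)  # placed between exit-actions and entry-actions
    task["log"].append("entry-actions")


# One entry per task type that refines the generic Start transition.
REFINED_START = {"Test": start_of_test_task}


def virt_start(task):
    """Simulated late binding: choose the operation from the node's type."""
    operation = REFINED_START.get(task["type"], generic_start)
    operation(task)
```

A task type without an entry in the table falls back to the generic transition, which corresponds to a type that does not refine Start.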

Graphical definitions of event handlers as presented in Figure 8 are
transformed as follows: Objects and links that are required to be available for the
event handler’s execution (i.e. a feedback’s source and target) are transformed
into a PROGRES graph query. A graph query searches for and returns the graph
nodes representing the needed process objects. Objects and links created
during the method’s execution are created within the graph through the operations
of the generic model specified in the meta model. All message calls to objects (i.e.

^ It is very important to call the generic transition as it defines the standard behavior 
of dynamic task nets and performs the state change. The call has to be placed 
between the source state’s exit-actions and the target state’s entry-actions.
test Thandle_CreateFeedback_Event ( src : Bottom_Up_Test;
                                    trg : Interface_Design;
                                    out FB : FEEDBACK;
                                    out impl_Module : Implement_Module;
                                    out mspec1 : ModuleSpecO;
                                    out mspec2 : ModuleSpecI; )

   return FB := '4; impl_Module := '2;
          mspec1 := '5; mspec2 := '6;

transaction handle_CreateFeedback_Event ( src : Bottom_Up_Test;
                                          trg : Interface_Design ) =
   (* declaration of local variables for process objects *)
   Thandle_CreateFeedback_Event ( ... )
   & CreateParameter (src, Error_ReportO, out er1)
   & CreateParameter (trg, Error_ReportI, out er2)
   & CreateDataflow (er1, er2, FB)
   & Suspend (src)
   & UnRelease (mspec1, impl_Module)
   & Suspend (impl_Module)
end;



Fig. 12. Transformed event handler 



messages 1-3 in Figure 8) are transformed into corresponding calls to the meta
model’s operations.

Searching for an object structure and adding an enhancement to
this object structure could best be realized as a graph transformation within
PROGRES. However, the creation and deletion of process objects through a
graph transformation would ignore the semantics of dynamic task nets which
are contained in the base operations.

Figure 12 shows a sample graph query and transaction for the collaboration 
diagram in Figure 8. Starting with the node for representing the created feedback 
flow, nodes representing the feedback flow’s source and target task, the affected 
tasks and significant parameters are extracted from the graph and returned to 
the transaction which calls the appropriate base operations on these objects. 
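
The division of labor between a read-only query and a transaction that calls base operations can be sketched as follows. This is an illustrative Python fragment, not the generated PROGRES code: the TaskNet class, its log and the consumers mapping are assumptions, and the UnRelease step of Figure 12 is omitted for brevity.

```python
class TaskNet:
    """Toy task net; its methods stand in for the base operations of the generic model."""

    def __init__(self, consumers):
        # Assumed structure: task name -> tasks that depend on its output.
        self.consumers = consumers
        self.log = []

    def create_parameter(self, task, parameter_type):
        self.log.append(("CreateParameter", task, parameter_type))
        return (task, parameter_type)

    def create_dataflow(self, source, target):
        self.log.append(("CreateDataflow", source, target))

    def suspend(self, task):
        self.log.append(("Suspend", task))


def query_affected_tasks(net, target):
    """Query part: only reads the structure and returns the needed objects."""
    return list(net.consumers.get(target, []))


def handle_create_feedback(net, source, target):
    """Transaction part: every change goes through a base operation."""
    affected = query_affected_tasks(net, target)
    report_out = net.create_parameter(source, "Error_ReportO")
    report_in = net.create_parameter(target, "Error_ReportI")
    net.create_dataflow(report_out, report_in)
    net.suspend(source)
    for task in affected:
        net.suspend(task)
```

Because the handler never modifies the structure directly, any semantics built into the base operations is preserved, which is the point made in the preceding paragraphs.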



5 Related Work 

Our approach combines the standard object-oriented modeling language UML
with graph transformation. To our knowledge this approach is unique with
respect to process management environments except for the predecessor of this
approach, called MADAM. MADAM introduced its own set of (visual)
languages for process model definition as an abstraction of PROGRES. However,
we realized that many of MADAM’s features could be mapped into diagrams of
the UML. Using the latter gives us the benefit of easier communication of models
to others. Additionally, we were able to further raise the level of abstraction.

However, there exist other approaches that transform high-level process mod- 
els to an underlying formal process programming language, such as ESCAPE [8]. 
ESCAPE is based on OMT which is one of the predecessors of the UML. It con- 
sists of an object-model (EER-diagram) to model a product structure on the 
type level, a coordination model (statecharts) for behavioral modeling and an 
organization model (tables). These models are transformed into rules of the 
rule-based process programming language MERLIN [11], which then executes 
the rules through forward and backward chaining. 

Besides these approaches based on an indirect execution paradigm there exist 
approaches supporting direct execution like those based on procedural program- 
ming [14] or Petri Nets [2]. A process is modeled by programming in the directly 
executable languages which are usually on a very detailed level. Abstraction from 
the paradigm used and visual modeling are not supported. 

Other publications introduce ways to model business processes with UML
using activity diagrams [9,16]. While activity diagrams have been invented to
model workflows, they are inadequate for modeling development processes
as these are hard to predict and evolve continuously. Modeling development
processes in class diagrams is more adequate since processes consist of dynamically
changing configurations of interacting objects.

With FUJABA [3] there exists an approach that allows graph transformations,
called story patterns, and graph schemas to be modeled in UML. Control flow is
specified using activity diagrams. Specifications are executable, but FUJABA
is still not a suitable approach for software process modeling: Having specified
a generic model using FUJABA, domain-specific extensions would have to be
specified as schema extensions for the structural model (where inheritance of
associations is unwanted) and activity diagrams for the behavioral model.
However, within an activity diagram a complex task net transformation, for example,
could not simply be modeled as a story pattern because data abstraction would be
violated. Rather, methods of various objects would be called. This approach is
equivalent to defining a specific process model in PROGRES itself and strongly
counteracts our aim to use UML for software process modeling while still offering
the process modeler language constructs suited to the modeling domain.

6 Conclusion and Future Work 

In this paper we presented dynamic task nets and their formal graph transforma- 
tion based specification. Dynamic task nets were developed to support a process 
manager in guiding and monitoring development processes and to support tech- 
nical developers in coordinating their work. Customizing dynamic task nets to 
a domain-specific process and providing process knowledge can be achieved by
using the UML. Enaction of dynamic task nets takes the modeled structural and 
behavioral constraints into account, because the UML model is transformed into 
an extension of their formal specification. The graph schema is extended with 
specific types and the generic operations can be refined type-specifically. 

Benefits of this approach are that we can offer a high-level language for
defining process models without losing the ability to enact them. Using the UML
further simplifies process modeling as it is widely used. Using graph
transformations for the specification of the syntax and semantics of a process meta
model, on the other hand, enables us to formalize execution and manipulation of
task nets with a uniform mechanism. The resulting gap between UML models
and enaction support is closed by providing a mapping between informal (UML)
and formal (PROGRES) process model definitions as described in Section 4.

A process can be modeled comfortably using the commercial CASE tool
Rose from Rational. After the model has been transformed into a PROGRES
specification, PROGRES’ code generation mechanism and the associated
framework for graph-based applications can be used to build a domain-specific process
management environment. As a proof of concept, the transformation has been
implemented as an extension to Rose.

Experience shows that model definition and enaction of instances cannot
necessarily take place sequentially. Rather, a process model definition has to be
adapted because process requirements have changed, process deadlines have passed,
etc. For this reason the environment has to support evolution of the process
model definition. This allows a process modeler to change the model definition
at enactment time and propagate these changes to the instance level on demand.
The state of process execution has to be preserved during the triggered task
net reorganization. Problems arise from the inability of the PROGRES
environment to deal with graph schema changes, but conceptual solutions have
been found to overcome this deficit.
Abstract. This position paper weighs the benefits against the
problems of using a graph rewrite system for the formal specification of an
integrated software engineering model and for its implementation using
the same graph rewrite system. The integrated software engineering
approach, called GRIDS, has been motivated by the shortcomings of
software engineering support for real-life software projects. It is based on
the formal integration of software engineering aspects for the automatic
construction and well-defined manipulation of situational project
frameworks. GRIDS uses the graph rewrite system PROGRES for the formal
specification of the concepts and for their prototypical implementation.
Without claiming to cover the entire field of graph rewrite systems, the
experiences of this particular, graph-based approach are used as an example
for a discussion about the adequacy, the benefits, but also the
shortcomings and the problems of applying a graph rewrite approach to realize
automated software and method engineering support.



1 Integrated Software Engineering 

The necessity and the benefits of structured, comprehensive, and sophisticated
software engineering for large-scale software system development have long
ceased to be a subject of discussion (at least in the academic community). The
software engineering task is usually sub-divided into three steps: identification of
the crucial software engineering aspects, definition of appropriate models for
each crucial aspect, and acquisition or in-house development of adequate tools
to support the modeling activities and to enact the use of the models.

The classical mission of software engineering research is to provide those
“appropriate” software engineering models. To fulfill this mission, the
complexity of the software engineering task is broken down into its different software
engineering aspects, which are then investigated separately. Consequently, this
procedure provides models for individual software engineering aspects, e.g.
software process models, system architecture models, configuration management
models, and requirements engineering models.

Focusing on a single software engineering aspect may lead to sophisticated 
partial software engineering models that provide more insights and deeper 
understanding about these particular aspects. But partial software engineering 
models are not sufficient for real-life software projects, because their scope is 
too limited. They fail to support comprehensive real-life software projects ade- 
quately, because the development team has to work with a set of unrelated or 
only implicitly related models. Partial software engineering models do not cap- 
ture the various complex interactions, relationships and dependencies between 
the different crucial software engineering aspects and thus provide potentially
incoherent models. Furthermore, they often do not support the modeling and control
of project evolution, thereby increasing the lack of integration between the
different views onto the software system and project information during the course
of the project.

The GRIDS project that resulted from the observations described above was
triggered by TNO-IAG, a Dutch geo-scientific organization for contracted
research and development, which recognized that, more than tuning their software
and hardware product technology, improving their software process technology
would help improve the development of their geo-scientific software systems.
The cornerstones for the project were:

— Increase of the flexibility in defining partial software engineering
models: A hierarchy of several levels of meta-models makes it possible to capture
consistently the different models used in software engineering on each level,
ranging from general meta-models for software engineering aspects to
concrete project frameworks.

— Integration of different software engineering aspects into one model:
GRIDS defines an assembly mechanism that allows a method engineer
to define easier-to-master isolated partial software engineering models which
are then automatically assembled into integrated draft software project
frameworks. The integrated project frameworks are networks of software
engineering fragments which comprise comprehensive project information and
which capture relationships and dependencies that exist between different
software engineering aspects.

This constructive approach allows trying out and varying different models for
each software engineering aspect as building blocks to arrive at a suitable
situational project framework, and reusing these building blocks later for
other projects.

— Modeling and control of project evolution: The range of formally specified
GRIDS modeling operations does not stop at the definition of partial
models and the construction of project frameworks. The evolution of
real-life projects makes it necessary to update and to modify a project framework
in a well-defined and controlled way, especially when the not so obvious
relationships between different software engineering aspects are affected.
The aim to capture and model sound software engineering activities is an
important part of GRIDS, too, as all construction and project manipulation
actions are formally specified and implemented.

— Adequate tool support: The size and complexity of real-life software 
projects require tools to support and enact the solution for the first three 
requirements, and to present them to the project members in a suitable way. 
The tool has to hide its formal base in favor of a comfortable user in- 
terface that presents its concepts in a way that is natural to the software 
project team members. 




Fig. 1. A simple integrated software engineering model



Figure 1 shows an example of the 3D-M. The 3D-M is an instance of the
generic GRIDS meta-model, investigating the integration of the three software
engineering aspects (called “dimensions”) software process, system architecture,
and system views. These three dimensions have been identified as crucial for
the technically-oriented organization of software engineering activities. Figure 1
shows a simplified project framework, automatically generated from three
partial software engineering models (the shaded areas) according to a configurable
assembly mechanism. The project framework consists of eight software
engineering fragments, each of which comprises information from the three partial
models (indicated for one sample fragment by the three dotted “ToConstituents”
links). The relationships within each partial model (“PreviousTo”, “Uses” etc.)
are propagated to the appropriate fragments of the project framework. Along
with additional project information (e.g. about resources and deliverables) not
shown here, the framework is set to play a central role in the project execution
and control. It provides a navigable graph of software engineering
activities with comprehensive information about the “topology” of the project and
its inter-aspect dependencies.
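
To make the idea of assembling fragments and propagating relationships concrete, the following Python sketch builds fragments from three small partial models. It rests on strong assumptions: fragments are formed as the full cross product of the model elements (the actual GRIDS assembly mechanism is configurable and is not described here), a relationship is propagated between two fragments that differ only in the related dimension, and all element names are invented.

```python
from itertools import product


def assemble(partial_models):
    """Assemble fragments and propagate relationships.

    partial_models maps a dimension name to a pair (elements, relations),
    where relations is a list of (relation name, source element, target element).
    """
    dims = list(partial_models)
    element_lists = [partial_models[d][0] for d in dims]
    # Assumption: one fragment per combination of elements, one from each model.
    fragments = [dict(zip(dims, combo)) for combo in product(*element_lists)]

    links = set()
    for d in dims:
        for name, source, target in partial_models[d][1]:
            for i, f in enumerate(fragments):
                for j, g in enumerate(fragments):
                    same_elsewhere = all(f[o] == g[o] for o in dims if o != d)
                    if f[d] == source and g[d] == target and same_elsewhere:
                        links.add((name, i, j))
    return fragments, links


models = {
    "process": (["Analysis", "Design"], [("PreviousTo", "Analysis", "Design")]),
    "architecture": (["GUI", "Kernel"], [("Uses", "GUI", "Kernel")]),
    "views": (["Data", "Function"], []),
}
fragments, links = assemble(models)
```

With two elements per dimension this yields eight fragments, each of which keeps a reference to its three constituents, which is the role the “ToConstituents” links play in Figure 1.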

The concepts of GRIDS are not explained in more detail here, because in this
context, they serve only to explain the motivation for using a graph rewrite
system as the basis for their implementation. A comprehensive presentation of GRIDS
can be found in [Zam96,Zam99]. The benefits of an integrated software
engineering approach have since then been confirmed by observations made in large
software projects at the Private Networks Division of Bosch Telecom in Germany.

2 Formalization and Implementation of GRIDS Using 
PROGRES 

Static and dynamic parts of the 3D-M have been specified formally using the
graph rewrite system PROGRES [Sch96]. PROGRES has been developed within
the IPSEN project which started in the mid 80s [Nag90]. The goal of the project
is to provide an integrated environment of interactive and incrementally working
tools that support a large variety of the activities that occur during conception
and realization of a software system. The core of PROGRES is a visual
programming language with the following design goals:

— use of a graphical syntax where appropriate but without excluding textual
syntax when it is more natural and concise

— distinction between data definition and manipulation, and use of graph class
declarations to typecheck graph manipulations

— relieving users of the task of guaranteeing confluence of defined rewriting systems
by keeping track of rewriting conflicts and backtracking out of dead-end
derivations

— support also for imperative programming of rule application strategies, not
relying only on the rule-oriented programming paradigm for all purposes.

In addition to the language concepts, a PROGRES system offers an
integrated set of tools that made PROGRES a very advantageous selection for the
rapid prototyping of GRIDS and its specialization, the 3D-M:

— a syntax-directed editor for visual and textual specifications of graph schemes 
and graph transformations, including an incrementally working pretty-printer 
and a layout editor 
— a browser for finding declarations or applied occurrences of given symbols

— an incrementally working type checker which detects all inconsistencies with
respect to the PROGRES language static semantics, explains highlighted
errors, and gives hints for their correction

— an integrated interpreter which translates PROGRES specifications
incrementally into intermediate code that is executed on an abstract graph
rewriting machinery

— a Tcl/Tk-based graph browser for monitoring manipulated graphs during an
interpreter session

— a cross-compiler from PROGRES to Modula-2 and C code that generates
executable prototypes from the formal definition of the graph scheme and
the dynamic operations

— a user-interface generator which produces Tcl/Tk code and supports rapid
prototyping of interactive graph-manipulating tools




Fig. 2. A screenshot of the GRIDS prototype 



Altogether, the PROGRES specification language and system are used

— for describing process modeling, version control, and configuration
management tools [Wes92,SW93,SWZ96,HJKW96]

— to support the definition of system architectures through the use of design
patterns [Rad99]

— for defining the semantics of a visual database query language [AE96,Rek94] 

— for modeling and managing of software development processes [JSW99,Sch99] 



An important and determining characteristic of the GRIDS approach is that 
it provides a graphical tool with which the project framework “topology” and 
information can be manipulated interactively and visually. This leads to 
the following important design issues: The actual project (graph) database has to 
be visualized graphically, and the user interface has to offer the formally defined 
software engineering actions to the users of the project framework. 

Figure 2 shows a screenshot of the executable PROGRES rapid prototype
automatically generated from the formal specification of GRIDS. It shows the
same simplified partial models and project framework as introduced in Figure 1.
The internal data structure, the so-called host graph, serves as interactive user
interface. The nodes, representing partial model elements and software
engineering fragments, can be selected interactively and, e.g., queried for more information.
The software engineering actions are grouped into several categories of pull-down 
menus, visible at the top of the window. When selected, they provide pop-up 
windows to enter the necessary parameters for their execution. After the exe- 
cution of an action, the updated host graph is automatically displayed in the 
prototype window. Individual layout settings and type visibility options support 
the selection of the desired project information. 

3 Achievements and Open Problems Using (This) 
Graph-Rewriting 

The GRIDS concepts of an integrated, formal modeling of the software engineer- 
ing aspects meet the requirements for a better support of the software engineer- 
ing task, and lead to a more comprehensive modeling of software engineering 
in general, and to enhanced software project frameworks in particular. Without 
claiming to cover the entire field of graph rewrite systems, the experiences of 
this particular, graph-based approach can be used as example for a discussion 
about the adequacy, the benefits, but also the shortcomings and the problems 
of applying a graph rewrite approach to realize automated software and method 
engineering support: 

— A probably trivial but not negligible observation is that graphs provide very
adequate data structures to model a large variety of software engineering
aspects, especially if they support, as PROGRES does, such powerful
concepts as multiple inheritance, path expressions, and derived attributes.

— Graphs are not only suitable as internal data repositories, but they are “vi- 
sually exploitable”, too, i.e. they can be a valuable part of an interactive 
user interface itself. This holds especially for the visualization of integrated 
software engineering models. 
— Though a formal means of specification, the declarative style of graph
rewriting can provide an intuitive way to capture real-life software
engineering contexts, e.g. the pre-conditions for the manipulation of project
information, or the definition of a complex query for information in a project
graph.

— The necessity to formalize every detail of the dynamic behavior of an
integrated software engineering model leads to a thorough reflection and
deeper understanding of the semantics of sound software
engineering activities, and of the various inter-dependencies between software
engineering aspects themselves.

— The combination of imperative and declarative language concepts for the
dynamic part of PROGRES (e.g. transactions) allows a very efficient
specification of the control flow(s) of applications.

— In its aim of offering a maximum of flexibility and expressive power, the
PROGRES language has become somewhat cumbersome to handle.
Often, several different ways of specifying a certain fact exist (e.g. graphical
“restrictions” vs. textual “constraints”). Even where performance reasons
might have motivated the extension of the language by another concept (e.g.
“static paths”), these reasons remain irrelevant as long as other performance
bottlenecks prevail (cf. below). The language has reached a size where a
down-sizing of its concepts seems advantageous (maybe comparable to the
evolution from C++ to Java).

— The fact that the concepts and the formal specification of GRIDS could be
validated to a certain extent by rapid prototyping was the ultimate
prerequisite for industry even taking notice of the idea of a formal specification
of integrated software engineering models.

— The fact that the PROGRES rewrite system is more than a graph rewrite
language, and that it provides a working means (actually two working means!)
to execute large formal graph rewrite specifications cannot be emphasized
enough. This is the legitimate key factor for the success of any graph
grammar engineering approach outside its own small community.

— The size of formal specifications of any serious application (like
integrated software engineering modeling) quickly rises to several hundreds of
pages, even before reaching any maturity. This is not necessarily a
disadvantage, if compared to the thousands of lines of code that a respective
conventional implementation, with its implicit assumptions and ambiguous
semantics, may contain. The manageability of the formal specification
is the crucial factor, but this quality is determined by the editors offered,
by the executability of the specifications, and by the quality of the rapid
prototyping of the related development system.

— The PROGRES editing tools, especially the static analyzer with its
detailed error messages and hints for corrections, offer good support for
browsing a specification and finding syntax errors. On the other hand, editing
and browsing are made somewhat cumbersome by the lack of a
module concept in the PROGRES language (its inventors announced the
introduction of a module concept a long time ago).
— A formal specification of the size of the 3D-M would have had no value
without prototyping, as the first, deeply erroneous attempts within GRIDS 
have shown. This regards not only the correctness of the specification itself, 
but also a prototyping of the ideas concerning the application. 

— Even for a rapid prototyping, the quality of the user interface is impor- 
tant. The layout algorithms of the PROGRES system have not been designed 
to support the use of the host graph for the interactive manipulation and 
visualization of integrated software engineering models and project
frameworks. In most of the applications of PROGRES, the host graphs are only
used as internal repository for the application data. The automatic layout
algorithms offered by the PROGRES system redraw the host graph completely
after each change, making them unsuitable for supporting the editing-like
activities which characterize the usage of GRIDS models. This required layout
behavior is common to all (syntax-directed) graphical editors, in this case 
especially to those of CASE tools. 

— Despite the theoretical background of graph grammar engineering, “mun- 
dane” industrial-strength features like an efficient graph database, a 
tight integration with other development tools, and a sophisticated error 
handling are indispensable. Despite several notable improvements of the 
PROGRES system and its underlying GRAS DBMS, for GRIDS the very 
respectable database size of a few hundreds of nodes and edges (yet way 
below the size of a real-life application) seemed to pose a performance limit 
- and a direction of further improvement. 

The integration of vendor development tools, e.g. compilers,
configuration management, and CASE tools, into the integrated software project
framework envisioned by GRIDS is another threshold between a simple
prototype for validation of the formal specifications as provided by PROGRES,
and a rapid application prototype for real-life software engineering contexts.

Summarizing the experiences made with applying a graph rewrite system for 
the development of an integrated, formal software engineering approach, PRO- 
GRES has provided a very satisfying means to formally specify and visually 
program the concepts of an interactive application, validate these concepts ad- 
equately, and last but not least communicate ideas in an unambiguous and at 
the same time intuitive way. 

By providing the rapid prototype system, PROGRES has risen far above 
other theoretical approaches in this research area. Concerning its industrial- 
strength qualities, an aim could be to reach the maturity of commercial SDL 
tools (eg. Telelogic TAU, ObjectGEODE by Verilog) that at the price of a much 
simpler formal model (SDL) are able to generate executable industrial-strength 
code for real-time applications. 
Abstract. In order to support multiple perspectives in software
development one needs a scheme which expresses explicitly all the views
held by the various stakeholders like requirements engineer, software
architect, client, user etc. The ViewPoints framework has been
developed in the past as a conceptual framework for expressing such a
multiple perspective setting in software development projects. In this
contribution we describe how this framework is formally described by
distributed graph transformation and we demonstrate the applicability of
our approach by presenting a non-trivial sample system.



1 Introduction and Related Work 

System design is a staged process as for example the spiral model of software devel- 
opment suggests. The various development stages are visited more than once. In such 
a setting multiple participants collaborate to construct a wide range of development 
artifacts. In addition, multiple components of the software system under construction 
must interoperate effectively to achieve the desired behaviour. Especially for large 
projects various participants with different needs and even conflicting views are in- 
volved. 

In order to support such a setting the ViewPoints framework was developed, which
pursues the idea of expressing the various views of the stakeholders explicitly. This also
involves tolerating inconsistent information in related ViewPoints until it seems
necessary or appropriate to check and (re)establish consistency - at least in some parts of
the system [4]. The ViewPoints framework has been used quite successfully and has
been documented in the literature [10].

Especially in industrial interdisciplinary projects where engineers of different
disciplines have to work together such a framework seems to support the actual
development process quite well. For example, the design information a production automation
engineer creates has to be set into relation to the software architecture a software designer
has to produce in order to control the automation plant. This involves loosely
integrating quite different notations and processes.

The question which is addressed here is how tool support can be constructed to
effectively represent the loosely coupled approach: some local development within a
ViewPoint is followed by interaction with related ViewPoints via consistency checks.
The approach of distributed graph transformation supports the idea of loosely coupled
ViewPoints as outlined above quite naturally. It realizes the separation between the
independent development of single local ViewPoints and the configuration and
connection of a set of related ViewPoints in a structured way.

The concepts as well as the formal definition of distributed graph transformation
are based on the double-pushout approach to algebraic graph transformation [1] where
basic concepts from category theory are applied. Distributed graph transformation is
introduced formally in [11].

The ViewPoints framework was devised by A. Finkelstein et al. [5] and B.
Nuseibeh [10] to describe complex systems. An overview of other approaches related to
multiple perspectives in software development can be found in [6]. In [10] a general
overview of inconsistency management is given. This work used a logic approach to
describe inconsistency check actions between ViewPoints. In contrast to our approach,
this logic approach is only used for the definition of ViewPoint check actions. Our
approach uses graph transformation to provide the entire ViewPoints framework with
formal underlying semantics. Thus, not only check actions are formalized, but general
benefits of formal specification are achieved for the entire ViewPoint architecture (cf.
conclusions).

In the chapter Basics we will introduce briefly the basics of our approach: first an 
overview wrt. the ViewPoints framework will be given and then we will sketch how it 
can be formalized by distributed graph transformation. Based upon this we will pres- 
ent an example in the chapter A Sample System: Integration of Software Architecture 
and Performance Evaluation. 



2 Basics 

ViewPoints have been successfully used in a wide variety of domains to express different
views and plans of participants in a development process. A ViewPoint is defined
to be a locally managed object or agent which encapsulates partial knowledge
about the system and its domain. It contains partial knowledge of the design proc- 
ess [5]. The knowledge is specified in a particular, suitable representation scheme. An 
entire system is described by a set of related, distributable ViewPoints which are 
loosely coupled. 

A single ViewPoint consists of five slots. The style slot contains a description of 
the scheme and notation which is used to describe the knowledge of the ViewPoint. 
The domain slot defines the area of concern addressed by the ViewPoint. The specifi- 
cation slot contains the actual specification of a particular part of the system which is 
described in the notation defined in the style slot. The fourth slot is called work plan 
and encapsulates the set of actions by which the specification can be built as well as a
process model to guide application of these actions. Two classes of work plan actions
are especially important: In-ViewPoint check actions and Inter-ViewPoint check actions
are used for checking consistency within a single ViewPoint or between multiple
ViewPoints, respectively. The last slot of a ViewPoint, called the work record, contains the
development history in terms of the actions given in the work plan slot.

A ViewPoint template is a kind of ViewPoint type. It is described as a ViewPoint in 
which only the style slot and the work plan slot are specified, i.e. the other slots are 
empty. When creating a new ViewPoint, the developer has the opportunity to use an 
existing ViewPoint template instead of designing the entire ViewPoint from scratch. 
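
The five slots and the template mechanism can be illustrated by a minimal sketch. It is not part of the ViewPoints framework or of any tool; the class `ViewPoint`, its field names and the function `instantiate` are hypothetical names chosen for illustration, and slots are reduced to plain strings and lists rather than graphs.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ViewPoint:
    """Hypothetical container for the five ViewPoint slots named in the text."""
    style: Optional[str] = None          # notation / representation scheme
    work_plan: tuple = ()                # names of the permitted actions
    domain: Optional[str] = None         # area of concern
    specification: list = field(default_factory=list)  # partial specification
    work_record: list = field(default_factory=list)    # history of applied actions

    def is_template(self) -> bool:
        # A template specifies only the style and work plan slots.
        return (self.style is not None and len(self.work_plan) > 0
                and self.domain is None
                and not self.specification and not self.work_record)

    def apply(self, action: str, item: str) -> None:
        # Only actions listed in the work plan may change the specification.
        if action not in self.work_plan:
            raise ValueError(f"action {action!r} is not in the work plan")
        self.specification.append(item)
        self.work_record.append((action, item))


def instantiate(template: ViewPoint, domain: str) -> ViewPoint:
    """Create a new ViewPoint from a template instead of from scratch."""
    if not template.is_template():
        raise ValueError("not a ViewPoint template")
    return ViewPoint(style=template.style, work_plan=template.work_plan, domain=domain)


qsdl_template = ViewPoint(style="QSDL", work_plan=("add state", "add declaration"))
browser = instantiate(qsdl_template, domain="WebBrowser")
browser.apply("add state", "disconnected")
```

In this sketch the work record grows as a side effect of every applied work plan action, which mirrors the description of the work record as a development history.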

The ViewPoints framework is independent of any particular development
method and actively encourages multiple representations. Software development
methods and techniques are defined as sets of ViewPoint templates which encapsulate
the notations provided as well as the rules for how they are used. Integration of methods
and views is realized by such rules referring to multiple ViewPoint templates. 

While application-internal states and behaviour of a single ViewPoint can be de- 
scribed by distributed graph transformation at the local level, distributed graph trans- 
formation at the network level is well suited for coordinating a distributed ViewPoint 
configuration. In detail, the five slots of a single ViewPoint are formalized by the 
following aspects of distributed graph transformation: 

• The style slot is represented by a local graph transformation system (i.e. a start graph and a
set of rules which represent assembly actions).

• The domain slot is described by a local graph (in most cases a single node attached with a
domain label suffices).

• The specification slot is represented by a local graph. Please note that formalizing rules for
transferring the specification in the original notation to its graph-based formalization as well
as for transferring the graph-based formalization back to the original notation have to be
specified by the developer of the ViewPoint template. However, these rules are in most
cases easy to develop, since most requirements engineering methods are based upon
diagrammatic notations [7].

• The rules and actions of the work plan slot are formalized by distributed graph transformation
rules for accessing multiple ViewPoints, network graph rewrite rules for coordinating a
ViewPoint configuration and local graph rewrite rules for accessing a single isolated
ViewPoint.

• The work record slot is represented by a local tree the edges of which are labeled with
applied development actions (i.e. applied graph rewrite rules defined in the work plan).

A more detailed presentation of our approach is given in [7]. In the next chapter we 
will now sketch an example. 



3 A Sample System: Integration of Software Architecture and 
Performance Evaluation 

In the last chapter we have introduced the ViewPoints framework and presented dis- 
tributed graph transformation as its underlying formal semantics. Now we will sketch 
a sample application of our approach where architecture and performance views are 
integrated.



3.1 Combining Architecture Design and Performance Analysis

As a non-trivial application of our framework we present the work within the project 
QUAFOS (Quantitative Analysis of FOrmally specified distributed Systems). The 
goal of the QUAFOS project is the integration of the architecture description language
Π [9, 3] with the Queuing Specification and Description Language QSDL, an
extension to the ITU’s Specification and Description Language SDL for non- 
functional system evaluation [2, 3]. Thus a performance model can be derived from a 
design model and maintained in parallel in order to analyze the architecture’s per- 
formance-related properties. This performance model can be simulated by the tool 
QUEST developed at the University of Essen [2]. 

Using the Π language, the functional behaviour of distributed components and their
connections can be described as well as performance-related attributes of this architecture.
For functional as well as performance-related evaluation, we use QSDL. We
have identified ViewPoint templates for both Π and QSDL; they are defined in detail
in [3]. Relations between ViewPoints generated from the Π and QSDL ViewPoint
templates are also investigated in detail in [7, 3].

In the next section we will introduce a simple example of how the ViewPoints framework
is used to describe a software system from multiple views. 



3.2 System Representation 

Let us consider developing a multimedia application globally distributed over the 
Internet. Figure 1 shows a snapshot of a system part after some development actions: 
it depicts a sample ViewPoint system representing a webbrowser connected to a cache 
component. Figure 1 comprises two Π ViewPoints representing distributed software
components realizing the browser and the cache as well as a Π ViewPoint representing
the remote use relation connecting both components. The system contains two
ViewPoints encapsulating QSDL-systems representing performance models for the
browser component and the cache component. The QSDL representation of the remote 
use relation is modeled by three ViewPoints encapsulating QSDL-systems for the 
transport medium as well as for a browser protocol instance and a cache protocol 
instance. 

In the next section we will sketch what a sample graph representation of a local
ViewPoint looks like.



3.3 Local ViewPoint Representation 

In order to sketch the graph formalization of a local ViewPoint we now investigate the 
QSDL ViewPoint representing a performance model of the webbrowser in detail. 
Figure 2 shows part of its specification slot represented as a distributed graph. 

Figure 3 depicts ViewPoint trigger actions for creating new ViewPoint specifications
as well as deleting ViewPoint specifications. Figure 4 shows some sample assembly
rules of the ViewPoint WebBrowser’s work plan slot.

Fig. 1. This network graph represents a ViewPoint system comprising architecture and
performance related views of a sample system part. Squares representing components and an
iconized arrow representing a remote use relation denote local graphs describing Π
specifications. Accordingly, hexagons representing QSDL systems denote local graphs containing
QSDL specifications. All horizontal network edges depict Inter-ViewPoint use relations and all
vertical arrows depict Inter-ViewPoint correspondence relations.




Fig. 2. This distributed graph depicts a part of the specification slot of the QSDL ViewPoint
representing the webbrowser. In this and in the following distributed graphs body elements of
local graphs are depicted in medium gray, exported graph elements are depicted in light gray
(connection request), imported graph elements are depicted in black (connection confirm) and
common parameter graph elements - i.e. graph elements which are imported and exported - are
depicted in mixed light gray/black (cf. Figure 6). Relations between local graph elements are
not explicitly depicted; we assume a mapping between graph elements with identical labels.



[Figure 3: distributed rules create VPspec (String ViewPoint) and delete VPspec (String ViewPoint), with NV={ViewPoint}]

Fig. 3. These distributed rules represent ViewPoint trigger actions for creating and deleting
ViewPoint specifications. The set NV denotes node label variables to which actual values have
to be assigned before a rule can be applied. The label values considered as rule parameters are
denoted after the rule’s name.








[Figure 4: assembly rules, each with a left-hand side L and a right-hand side R: add state (String ViewPoint, state); add declaration (String ViewPoint, signal list); add next state signal output (String ViewPoint, state, signal, nextstate), with NAC; add next state signal input (String ViewPoint, state, signal, nextstate), with NAC; import signal input (String ViewPoint, signal); export signal output (String ViewPoint, signal)]

Fig. 4. These assembly rules allow adding isolated states, signal declarations and states connected
with signal in- and output nodes as well as exporting signal output nodes and importing
signal input nodes.

The ViewPoint specification shown in Figure 2 can be obtained by applying the following
rules (cf. Figure 3 and Figure 4):

♦ create VPspec (“WebBrowser”),

♦ add state (“WebBrowser”, “disconnected”),

♦ add next state signal output (“WebBrowser”, “disconnected”, “connection request”,
“wait”),

♦ add next state signal input (“WebBrowser”, “wait”, “connection confirm”, “connected”),

♦ add declaration (“WebBrowser”, “connection request, connection confirm;”),

♦ export signal output (“WebBrowser”, “connection request”),

♦ import signal input (“WebBrowser”, “connection confirm”).
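
The rule sequence above can be replayed in a small sketch. It does not implement distributed graph transformation: the class `SpecGraph` and its method names are hypothetical, the local graph is reduced to a few Python sets, and the preconditions shown (including the reading of the NAC as "do not add the same transition twice") are assumptions, since the rule diagrams themselves are not reproduced here.

```python
class SpecGraph:
    """Hypothetical stand-in for the local graph of a QSDL specification slot."""

    def __init__(self, viewpoint):
        self.viewpoint = viewpoint
        self.states = set()
        self.transitions = set()   # (state, direction, signal, next_state)
        self.declared = set()
        self.exported = set()
        self.imported = set()
        self.work_record = []      # applied rules, in order

    def _log(self, rule, *args):
        self.work_record.append((rule, args))

    def add_state(self, state):
        self.states.add(state)
        self._log("add state", state)

    def add_next_state(self, direction, state, signal, next_state):
        # Assumed precondition: the source state must already be present.
        if state not in self.states:
            raise ValueError(f"no state {state!r} to attach to")
        edge = (state, direction, signal, next_state)
        # Assumed reading of the NAC: do not add the same transition twice.
        if edge in self.transitions:
            raise ValueError("transition already present")
        self.states.add(next_state)
        self.transitions.add(edge)
        self._log(f"add next state signal {direction}", state, signal, next_state)

    def add_declaration(self, signal_list):
        names = [s.strip() for s in signal_list.rstrip(";").split(",")]
        self.declared.update(names)
        self._log("add declaration", signal_list)

    def export_signal_output(self, signal):
        self.exported.add(signal)
        self._log("export signal output", signal)

    def import_signal_input(self, signal):
        self.imported.add(signal)
        self._log("import signal input", signal)


spec = SpecGraph("WebBrowser")           # corresponds to create VPspec
spec.add_state("disconnected")
spec.add_next_state("output", "disconnected", "connection request", "wait")
spec.add_next_state("input", "wait", "connection confirm", "connected")
spec.add_declaration("connection request, connection confirm;")
spec.export_signal_output("connection request")
spec.import_signal_input("connection confirm")
```

The list `spec.work_record` plays the role of the work record slot: it holds the applied rules in application order.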

The ViewPoint’s style slot comprises a graph transformation system for building
graphical QSDL specifications. This can be modeled by an empty start graph and a set
of assembly rules like the ones depicted in Figure 4. Please refer to [3] for a detailed
description of the QSDL ViewPoint template’s style slot. 

Figure 5 shows some sample In-ViewPoint check rules checking a simple QSDL 
representation of a non-reliable and a reliable communication connection as well as 
transforming a non-reliable into a reliable connection. Please refer to [7] for a detailed 
classification of In- and Inter-ViewPoint check actions possible by distributed graph 
rewrite rules as well as more examples both at the local level and at the network / 
reconfiguration level.

Fig. 5. Sample In-ViewPoint check actions.

In the next section we will present a kind of Inter-ViewPoint relation especially important
for integrating multiple perspectives: Inter-ViewPoint Correspondence Relations.
Other kinds of Inter-ViewPoint relations are described in [7].



3.4 Representation of Inter-ViewPoint Correspondence Relations 

In Π, various views upon a software component can be specified. Within the interaction
view of a client component, performance requirements wrt. remote use relations
and remote server components can be stated (e.g., a value for maximum response
time). Accordingly, within remote use relations a communication protocol with
performance attributes can be selected. A remote use relation is only valid for a client
component if its performance attributes satisfy the component’s performance requirements.
As this distribution information is integrated into the Π language for
evaluating architectures, the formalisms for noting performance attributes and requirements
are based upon the QSDL sensor concept. A sensor can be placed anywhere
in the QSDL-system and collects information about system events during the
simulation of the QSDL-system (e.g., a counter for signal throughput of a signalroute).
The interaction view’s performance requirements are sensors extended by a compare
operator and a concrete value (e.g., response time < 10 ms).
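
The validity condition for a remote use relation can be sketched as follows. This is an illustration only and not QSDL or QUEST functionality: the class `Requirement`, the function `relation_is_valid` and the attribute values are hypothetical, and sensor readings are assumed to be plain numbers in milliseconds.

```python
import operator
from dataclasses import dataclass

# Compare operators a requirement may use.
OPERATORS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


@dataclass(frozen=True)
class Requirement:
    """A sensor name extended by a compare operator and a concrete value."""
    sensor: str
    op: str
    value: float

    def satisfied_by(self, attributes: dict) -> bool:
        # A missing attribute cannot satisfy the requirement.
        if self.sensor not in attributes:
            return False
        return OPERATORS[self.op](attributes[self.sensor], self.value)


def relation_is_valid(requirements, attributes) -> bool:
    """A remote use relation is valid only if every requirement holds."""
    return all(r.satisfied_by(attributes) for r in requirements)


client_requirements = [Requirement("response time", "<", 10.0)]   # milliseconds
fast_protocol = {"response time": 7.5}     # hypothetical attribute values
slow_protocol = {"response time": 12.0}
```

With these values, `relation_is_valid(client_requirements, fast_protocol)` holds while the same check against `slow_protocol` fails.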

Bidirectional relations wrt. performance-related system properties between Π and
QSDL ViewPoints can be identified: performance attributes captured in the architecture
design can be evaluated, and the results of the evaluation may in turn feed back into
new insights wrt. architecture requirements. This indicates that in addition to use relations
(which represent unidirectional relations) we need correspondence relations
between Π and QSDL elements (which represent relations in both directions).

Coming back to our sample architecture a performance requirement of the web 
browser component may be an actual value for the round trip time of a connection 
request signal to the cache component. Figure 6 shows a distributed graph representing
ViewPoints containing partial specifications of the Π remote use relation and the
QSDL protocol instance. The correspondence relation is realized as a bidirectional
graph relation, i.e. two use relations in both directions.

Fig. 6. This distributed graph depicts a correspondence relation between the Π ViewPoint
encapsulating a specification part of the remote use relation and the QSDL ViewPoint encapsulating
a specification part of the protocol instance from the view of the Π ViewPoint. Thus a
Π performance attribute is associated with a QSDL sensor. The attribute/sensor node, the operator
node and the value node are colored as import/export nodes. Please note that the round
trip time node represents a performance attribute within the Π context and a sensor within the
QSDL context. The variable n represents a value for the round trip time.



4 Conclusion and Further Work 

In this contribution we have shown that distributed graph transformation is a natural 
formalism to serve as underlying semantics of the ViewPoints framework. 

Our approach provides several benefits: 

♦ In order to allow for tool support and rigorous mathematical reasoning about correctness /
compliance a formalization of the ViewPoints framework is desirable.

♦ Since most methods in requirements engineering are graphics-based a graph-based formalization
is natural. Moreover, text-based formalisms have problems with representing methods
which make use of graphical / structural information intensively.

♦ Graphs combine intuitive usability with a formal basis. The level of formality visible to the
user is highly scalable. The method designer creating a ViewPoint template works at a more
detailed level using the features of our approach to their full extent while the method user
(requirements engineer) operating on ViewPoint instances works at a more abstract and
symbolic level.

♦ Graph transformation is well suited to represent the structure of an entire ViewPoint, thus a
single formalism suffices to represent all ViewPoint slots.

♦ The distinction of network and local level within distributed graph transformation supports a
clear and natural visual impression of the system’s structure. For example, graph transformation
at the network level can be employed to describe dynamic reconfiguration of distributed
ViewPoint configurations, while graph transformation at the local level can be used to
model evolving data within single ViewPoints.

♦ Distributed graph transformation provides a modularization concept for graphs and graph
transformation systems. This makes it possible to describe the interfaces of a ViewPoint and
the various Inter-ViewPoint relations in a structured way. Further, it helps to manage the
complexity of large graphs which will be common in realistic and industry-relevant projects.
Wrt. tool support the module concept also reduces the complexity of the graph matching
problem.

♦ Inconsistencies can be defined in terms of relations that should be handled by distributed
graph rewrite rules. Rules are very natural for this task, because they support an if-then style
of formulating inconsistency handling: if a certain inconsistent situation occurs, then the
following action should be performed. Rules do not prescribe a certain control flow, but
their application order is dependent on the ViewPoint development. This suits our policy of
handling inconsistencies.

Current work is developing tool support using this formalization as a foundation. 
In [8] we give a brief overview of the ViewPoint Tool which is based on the graph 
transformation tool AGG [12].



Abstract. We discuss geometric positioning, highlighting of visited
nodes and user defined highlighting that form the algorithm animation 
facilities in the Grrr graph rewriting programming language. The main 
purpose of animation was initially for the debugging and profiling of 
Grrr code, but recently it has been extended for the purpose of teaching 
algorithms to undergraduate students. The animation is restricted to 
graph based algorithms such as graph drawing, list manipulation or 
more traditional graph theory. The visual nature of the Grrr system al- 
lows much animation to be gained for free, with no extra user effort be- 
yond the coding of the algorithm, but we also discuss user defined an- 
imations, where custom algorithm visualisations can be explicitly de- 
fined for teaching and demonstration purposes. 



1 Introduction 

Grrr is a visual graph rewriting programming language [16,17]. It is general purpose, 
allowing the implementation of complex graph algorithms and has a visual view of 
graphs. We believe these factors make it a good system in which to code graph algo- 
rithm animation. Much of the work described here was initially designed as debugging 
tools for the initial Spider language and later Grrr language, but their wider applica- 
bility has encouraged us to extend the ideas and develop more general animation tech- 
niques. 

We describe three algorithm animation techniques in this paper. The first technique is 
that of user defined emphasis, which has always been in our system in a limited fash- 
ion, as the programmer can use built in transformations to highlight chosen nodes and 
subgraphs of the host graph. Second, tools for animation have resulted from the recent 
graph drawing variation, Grrr, which allows nodes to be positioned at a geometric 
point. The movement of the node on the screen can be followed for better under- 
standing of the progress of the algorithm. Third, we have recently implemented the 
automatic highlighting of subgraphs that have been matched in the host graph. This 
means that sections of the host graph that have been visited are shown.

The animation of node movement and the highlighting of matched subgraphs come
for 'free', in that they can be used without any user input except specifying that anima- 
tion is required. The user highlighting is for custom animation and requires a pro- 
grammer to ensure that the correct section of the host graph is highlighted during the 
progress of the algorithm. The node movement animation can also be used to produce 
custom animation by the programmer specifying the preferred location of nodes for 
best comprehension. The three techniques given here can be combined as desired. 

We do not claim any originality for our animation methods, but to our knowledge 
this is the first time a graph rewriting language has been associated with algorithm 
animation. We believe our system is very suited to animation because of its visual 
emphasis, and because the design has resulted in semantics which are useful for ani- 
mation. The visual nature of the programs in Grrr means there is no 'impedance mis- 
match' that might occur when defining a textual program for visual execution. 

Grrr allows the execution of rewriting to be viewed on the screen as it happens. 
This step view of rewriting is an important part of the animation process as it allows 
highlights and movement to be shown as the algorithm progresses. The user can also 
step through a program manually, taking their own time to observe the execution. 

Another feature of Grrr that helps with algorithm animation is the ability to hide 
subsections of the host graph so that only the data structures that are being manipu- 
lated or which are relevant to the user can be seen, and so the housekeeping under- 
neath can be hidden to avoid confusion. 

Algorithm animation in Grrr has two main roles. The original intended role
was to aid the debugging of programs written in Grrr: graph match highlighting
can indicate where the rewrites are operating, and showing node movement can indicate
the way nodes are manipulated in graph drawing. The second role is educational:
visualising algorithms in order to teach them, the standard motivation
behind algorithm animation systems.

The algorithm animation in Grrr is entirely restricted to graph highlighting and 
movement, whereas many dedicated animation systems have facilities for more ab- 
stract representation, using extra graphics and shading to aid visualisation [2,3,12,20]. 
This type of animation is not easy to define with the graph rewriting described in this 
paper. However, we note that several systems allow similar types of animation to the 
graph oriented approach provided in Grrr, e.g. [6,9,10,13], so we feel we are justified 
in restricting our system. We must note that most animation systems are primarily 
designed for teaching algorithms, but studies have thrown doubt over the usefulness of 
algorithm animation as a learning tool [8,19]. 



2 Programming with Graph Rewrites 

Grrr is a graph rewriting programming language. It computes by rewriting a host 
graph according to user defined transformations. A key advantage to this approach is 
the combination of computational completeness and visual view of both the graph 
being rewritten and the transformations that rewrite the graph. This combination, 
along with features such as serial rewriting and serial trigger initiation, makes Grrr a
potentially useful system for inherently visual but complex tasks such as graph drawing
and algorithm animation.

Previous graph rewriting languages include GOOD [15], Progres [18], 
Dactl/MONSTR [7,1] and Δ-grammar programming [11], each of which has a unique
interpretation of programming with graph rewrites. These graph rewriting languages 
vary in several important aspects: the type of host graph that is to be rewritten may be 
any graph, or it may be restricted by disallowing duplicate nodes or arcs, or indeed 
may have some underlying hierarchical structure; the graph may be rewritten in a 
serial or parallel manner; the transformations may be initiated in a number of ways; 
the transformations may be applied in serial or parallel; and there are alternative ways 
that the user can specify the transformations. Typically, the systems have general 
programming features, but are aimed at specific applications. 

Grrr is a development of the Spider graph rewriting programming language. Spider 
is a prototype system for database programming. Modified, it forms the basis of Grrr, 
a general purpose programming language, which we are using to explore the notions 
of visual graph drawing. Graph drawing has been seen in graph rewriting systems 
previously [4,21]. Our current project is attempting to demonstrate that programming 
a wide range of graph drawing algorithms is feasible in a graph rewriting visual lan- 
guage. To achieve this we are in the process of producing hierarchical, force directed 
and planar graph drawing algorithms in Grrr. We note that Grrr is still under devel- 
opment and future changes both to the semantics and implementation are likely. 

Grrr features serial trigger initiation and a two graph rewrite specification method:
the difference between the LHS and RHS in a rewrite indicates the changes to be made
to the host graph. The rewrites are contained in transformations. When a transformation
is called the LHS graphs are tested against the host graph in a top down manner
until one matches, that is, they are tested in order of presentation in the transformation.
We use this approach rather than alternatives such as 'best fit' (i.e. classifying LHS by
how specific they are) because of its success in analogous textual rule based systems
such as logic and functional languages. There is also the problem of interpreting best
fit in a graph based system.
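
The top down choice of rewrite can be illustrated with a deliberately simplified sketch. It is not Grrr code: the function `call_transformation` is a hypothetical name, the host graph is assumed to be a set of labelled arcs, and matching is reduced to checking that all LHS arcs are present, whereas Grrr performs real subgraph matching.

```python
def call_transformation(host, rewrites):
    """Apply the first rewrite whose LHS matches; return the new host graph.

    host     -- frozenset of arcs, each arc a (source label, target label) pair
    rewrites -- list of (lhs, rhs) pairs in order of presentation
    """
    for lhs, rhs in rewrites:
        if lhs <= host:                      # simplified match: LHS arcs all present
            return (host - lhs) | rhs        # difference LHS/RHS gives the change
    return host                              # no rewrite matched


rewrites = [
    (frozenset({("a", "b"), ("b", "c")}), frozenset({("a", "c")})),  # tested first
    (frozenset({("a", "b")}), frozenset({("b", "a")})),              # tested second
]

host = frozenset({("a", "b"), ("b", "c")})
after = call_transformation(host, rewrites)
```

Both rewrites match the host graph in this example, but only the first one in order of presentation is applied; no ranking by specificity takes place.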

The transformations are called by trigger nodes (shown with a rectangular shape) in 
the host graph, and only one trigger is initiated at a time. This is achieved by a newest 
first execution order for the triggers in the graph. Only one LHS graph is matched at a 
time, and the rewriting occurs in a serial manner using a deterministic subgraph 
matching strategy that relies on the nodes and arcs in the graph having an internal 
ordering. The serial nature of Grrr aids algorithm animation as a parallel rewriting or
trigger initiation strategy could hide the progress of the algorithm.
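
Newest first execution of triggers behaves like a stack, which the following sketch illustrates. It is an assumption-laden model rather than the Grrr implementation: the function `run` and the sample transformations are hypothetical, and each transformation is modelled as a Python function that returns the names of the triggers it adds to the host graph.

```python
def run(triggers, transformations, host):
    """Execute triggers one at a time, newest first; return the number executed."""
    stack = list(triggers)          # last element is the newest trigger
    steps = 0
    while stack:
        name = stack.pop()          # only one trigger is initiated at a time
        new_triggers = transformations[name](host)
        stack.extend(new_triggers)  # triggers created now run before older ones
        steps += 1
    return steps


trace = []

def outer(host):
    trace.append("outer")
    return ["inner", "inner"]       # two new triggers are added to the host

def inner(host):
    trace.append("inner")
    return []

def last(host):
    trace.append("last")
    return []

steps = run(["last", "outer"], {"outer": outer, "inner": inner, "last": last}, host={})
```

Because execution is strictly serial, the list `trace` records a single unambiguous order of events, which is the property that makes the execution easy to animate.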

The data graph (that is the part of the host graph that holds application data, usually 
shown with round nodes) can be distinguished from the part of the graph that holds 
associated information (that is, information derived from the data graph and informa- 
tion concerning execution, usually shown with oval nodes). A node type specified in a 
rewrite will only match with that node type in the host graph. 

Grrr allows arbitrary graphs to be rewritten. To avoid ambiguity when deleting or
adding primitives, duplicate labels which appear in the LHS or RHS must be identified
by the user. The identifier is an integer superscripted to the node label.

Current modifications to the rewriting process include attractor nodes, negatives,
once only nodes and single match rewrites. For example, Figure 2 shows a transformation
with a single match rewrite, indicated by a shaded background. The LHS of
this rewrite will match once and only once when the associated trigger node is called. 
After matching, the single match rewrite will be ignored when further calls of the 
particular trigger node are made. 

Figure 2 also shows the use of negative primitives in LHS graphs. Here, the second 
rewrite contains a negative node and arc, indicated by the primitives having thick 
outlines (not to be confused with the highlighting of nodes in the host graph). For this 
LHS to match, the positive part of the graph must match, and there must be no corre- 
sponding match of all the LHS including the negatives. 
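
The matching condition for negative primitives can be written down in a simplified form. The sketch assumes, hypothetically, that graphs are sets of labelled arcs and that a match is plain set inclusion; the function `lhs_matches` is not a Grrr built in.

```python
def lhs_matches(host, positive, negative=frozenset()):
    """Return True if the positive part matches and the full LHS does not."""
    if not positive <= host:
        return False                 # the positive part must match
    if negative and (positive | negative) <= host:
        return False                 # a match including the negatives blocks the rewrite
    return True


positive = frozenset({("a", "b")})
negative = frozenset({("b", "visited")})

plain_host = frozenset({("a", "b")})
marked_host = frozenset({("a", "b"), ("b", "visited")})
```

Here the LHS matches `plain_host` but not `marked_host`, because in the latter the positive part and the negative arc can be matched together.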

Figure 1 illustrates the use of attractor nodes, with the RHS of the second rewrite 
having the attractor node 'Minus', indicated by a shaded background. Attractor nodes 
pick up any dangling arcs after a rewrite has been performed. Normally such dangling 
arcs are deleted from the host graph. 

Not shown in the examples is the use of once only nodes in LHS graphs. Here a 
node can be specified to be a once only node, and such a node will match no more 
than once with each corresponding node in the host graph. This allows for simple 
iteration through a graph. 
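
How once only nodes give simple iteration can be sketched as follows. The generator `iterate_once_only` is a hypothetical illustration, assuming host nodes are (identifier, label) pairs and that the system remembers which host nodes a once only node has already matched.

```python
def iterate_once_only(host_nodes, label):
    """Yield each host node with the given label exactly once."""
    already_matched = set()
    while True:
        candidates = [n for n in host_nodes
                      if n[1] == label and n not in already_matched]
        if not candidates:
            return                   # every matching host node has been visited
        node = candidates[0]         # deterministic choice by internal ordering
        already_matched.add(node)
        yield node


host_nodes = [(1, "Item"), (2, "Other"), (3, "Item")]
visited = list(iterate_once_only(host_nodes, "Item"))
```

Repeated calls of the same rewrite therefore walk through the matching nodes one by one and stop when none is left.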

To perform mathematical calculations and to express geometric operations in Grrr, 
there are many built in transformations. Many of the built ins are atomic, however 
others have been added for efficiency reasons. 

Often the progress of Grrr programs is expressed in terms of number of steps. Each 
step is an execution of a trigger node, and can be considered much like the execution 
of a single instruction in a traditional textual programming language. 



3 Illustrations of Use 

Here we give all or part of three programs to illustrate the varied nature of the anima- 
tion in Grrr. The transformations that make up the programs contain highlights in 
order to distinguish between nodes with different semantic meanings, which should 
not be confused with the highlights shown in the host graphs, which are purely for 
animation purposes. 

There are three ways of including animation in Grrr programs: by indicating via a 
menu option that the matched part of the host graph should be highlighted, by indi- 
cating via a menu option that any node movement should be animated, and by adding 
a built in trigger to highlight a chosen node. 

The automatic highlighting of matched subgraphs is a useful tool in debugging Grrr 
programs as it indicates that the desired part of the host graph has been matched by a 
LHS graph. However, it can also be used for other sorts of animation by indicating 
which part of the host graph has been visited, as shown in the shortest path example,
Section 3.2. When nodes and arcs are highlighted, the line thickness increases and the
colour changes from black to purple. There is an element of arbitrariness to this, and
the specification of highlights can be changed.

Animation of node movement is a result of recent work in adding geometric triggers
to Grrr, and as with highlighting matched subgraphs it is a feature that can be
used for either debugging or within a custom animation. When producing graph 
drawing algorithms, such as the force directed algorithm given in Section 3.1, it is 
very useful to observe the process of the algorithm for evaluating the success of the 
approach and confirming the correctness of the implementation. However, in terms of 
animation, the bubble sorting example given in Section 3.3 shows how geometric 
operations can be added to a purely graph theoretic algorithm in order to clarify the 
approach. In terms of visualisation, nodes that are moved are shown changing position 
on the screen. 

The notion of adding triggers that change the appearance of nodes is entirely cus- 
tom animation directed. The built in triggers include those to simply highlight the 
nodes. However there are more flexible commands to change the colour of nodes, and 
clear all highlights in the host graph. 



3.1 Force Directed Graph Drawing 

The animation of graph drawing algorithms requires no extra work by the user. The 
movement of nodes from one position to another can be shown by selecting a menu 
option. The fine tuning of algorithms is made easier because the immediate effect of 
altering parameters, or other changes to the drawing process can be seen. Also, the 
way poor drawings occur can be observed, so allowing changes to cope with situations
such as subgraphs getting into local minima or rogue nodes being misplaced.

As an example, we show part of a force directed graph drawing algorithm. It is difficult
to show the actual animation in a research paper, but we hope that it is clear that
changing various aspects of this algorithm is quite easy, even when the algorithm has
been partially executed. The parameters for node movement can be altered both in the 
transformation definition and in the host graph. The function used for deriving the 
amount of node movement can also be altered from the very simple calculation given 
here into a more complex formula that may have a beneficial effect on layout. 

The method treats arcs as springs between nodes, attracting them together, whilst
unconnected nodes are repelled. The algorithm presented here first iterates through the 
connected node pairs, bringing them closer, and then it iterates through all node pairs 
separating the nodes which are within a set distance of each other. This process is 
repeated a number of times, with the distance that the nodes are attracted reducing on 
each iteration. Our version of the force directed approach is a simple variant on those 
described in [5,14]. 
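
The attract-then-repel loop described above can be sketched outside Grrr as follows. This is a minimal Python illustration, not the Grrr program itself: the function names (`separation`, `move_pair`, `layout`) and the parameter values (`min_dist`, `push`, `iterations`) are assumptions made for the example, and the attraction step (the gap above a minimum distance, divided by the iteration number) is only one simple reading of the calculation outlined in the caption of Fig. 1.

```python
import math
from itertools import combinations


def separation(pos, a, b):
    """Euclidean distance between nodes a and b."""
    (ax, ay), (bx, by) = pos[a], pos[b]
    return math.hypot(bx - ax, by - ay)


def move_pair(pos, a, b, amount):
    """Move a and b by `amount` each along the line joining them.

    A positive amount brings the pair together (in the spirit of 'Closer');
    a negative amount pushes it apart (in the spirit of 'Further').
    """
    d = separation(pos, a, b)
    if d == 0:
        return  # coincident nodes define no direction; leave them in place
    (ax, ay), (bx, by) = pos[a], pos[b]
    ux, uy = (bx - ax) / d, (by - ay) / d
    pos[a] = (ax + ux * amount, ay + uy * amount)
    pos[b] = (bx - ux * amount, by - uy * amount)


def layout(pos, arcs, iterations=5, min_dist=90.0, push=10.0):
    """Return new positions after `iterations` rounds of attract-then-repel."""
    pos = dict(pos)
    for it in range(1, iterations + 1):
        # Attraction: connected pairs close part of the gap above min_dist;
        # dividing by the iteration number shrinks the step on each round.
        for a, b in arcs:
            excess = separation(pos, a, b) - min_dist
            if excess > 0:
                move_pair(pos, a, b, excess / (2 * it))
        # Repulsion: every pair closer than min_dist is pushed apart
        # by a constant amount.
        for a, b in combinations(sorted(pos), 2):
            if separation(pos, a, b) < min_dist:
                move_pair(pos, a, b, -push)
    return pos
```

Recording the positions after each call to `move_pair` would give the sequence of frames that an animation of this layout would display.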

The two built in transformations that move nodes are 'Closer' and 'Further', which 
attract and repel node pairs respectively. They are used in the two transformations 
from the program, shown in Figures 1 and 2. The start host graph is shown in Fig. 3 
and the final host graph is shown in Fig. 4.

Fig. 1. The transformation 'TestDist'. This brings connected nodes closer together. The
distance they are moved together is greater when the nodes start further apart. The built in
'Closer' transformation moves both argument nodes an identical distance towards each other.
The distance is calculated from the distance they are apart (found using the 'NodeSeparation'
built in trigger) and the number of iterations that have taken place. The calculations are
performed by the 'Divide' and 'Minus' built ins. The user defined transformation 'MinDistance'
simply returns the constant 90 in this implementation, which can be used by 'Minus',
which will in turn return a number that can then be used by 'Divide'.











Fig. 2. The transformation 'Separate'. This finds the nodes that are closer than a set
distance from a node and then moves them apart a constant distance. 'BBox' is a built in
transformation that returns the nodes within the specified rectangle, 'OverlapBox' is also built in and
returns the rectangle containing the specified nodes (or single node as in this case). The nodes
within the rectangle are then separated with the built in 'Further' transformation that moves both
argument nodes an identical distance from each other

Fig. 3. At the start of execution. There will be 5 iterations of first closing the nodes
connected by arcs and then separating nodes that are too close




Fig. 4. At the end of execution 



3.2 Shortest Path 

This algorithm makes use of the highlighting to indicate the success of a graph search. 
In this case we are finding a shortest path between two nodes in an unweighted graph, 
so a simple breadth first search will suffice. This is a version of an algorithm given
in [17], hence we only show the major alteration, Fig. 5, which changes the algorithm
by maintaining the structure of the graph and highlighting the path found, rather than 
deleting the nodes not participating in the path. This algorithm is a good example of 
using the match highlighting feature for animation purposes. Fig. 6 shows the host 
graph at the start of execution. Fig. 7 shows the host graph at the end of execution.

The search method can be clearly seen when the graph is stepped through slowly,
as only the matched nodes are visible; the path is added after the search has
found the 'arg2' node.
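
The behaviour described here (mark what the search visits, then mark the path once the target is reached) can be sketched in Python. This is not the Grrr program: it assumes a breadth-first search, which is the traversal that guarantees a shortest path in an unweighted graph, and the function name `shortest_path` and the adjacency-dictionary representation are choices made for the example. The visit order stands in for the highlighting of matched nodes, and iterating over neighbours in sorted order plays the role of a fixed node ordering that makes the result deterministic.

```python
from collections import deque


def shortest_path(adj, start, goal):
    """Breadth-first search over an adjacency dictionary.

    Returns (path, visited): `path` is a list of nodes from start to goal,
    or None if the goal is unreachable; `visited` lists nodes in the order
    they were first reached, which is what an animation would highlight.
    """
    parent = {start: None}
    visited = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in sorted(adj[node]):  # fixed order keeps the result deterministic
            if nxt not in parent:
                parent[nxt] = node
                visited.append(nxt)
                queue.append(nxt)
    if goal not in parent:
        return None, visited
    # Walk the parent links back from the goal to recover the path.
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    return path, visited
```

Nodes that appear in `visited` but not in `path` correspond to nodes that were searched but do not lie on the chosen path.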





Fig. 6. The host graph at the start of execution. The program is searching for a path
between the two round nodes connected to the trigger by 'arg1' and 'arg2' arcs. The search is
from the 'arg1' node to the 'arg2' node

Fig. 7. The host graph at the end of execution. The black nodes indicate the nodes
visited which are not in the path, the thick lined nodes indicate the nodes in the path, and the
unchanged nodes are those that have not been visited. The algorithm finds only one shortest
path of possible candidates, hence the path given was chosen over the alternative (using the
black nodes) by the Grrr node ordering system which ensures the matching process is
deterministic. The animation uses different colours when appearing on the screen, but we are limited to
a black and white display for this paper



3.3 Bubble Sort 

Here we give an example of a purely custom visualisation task. This is the sort of 
algorithm animation that is a useful teaching aid. Sorting is not an ideal task to per- 
form with Grrr, as the relative lack of complexity of the data structure that is manipu- 
lated makes it less suited to our form of graph rewriting. The housekeeping concerned 
with list iteration, for example, dealing with all cases of nodes with or without prede- 
cessors or successors, means that transformations often have many rewrites, one for 
each case. We show all the transformations in this program to indicate some of the 
difficulties of producing this sort of custom visualisation task. 

The program sorts a list represented by a set of nodes connected by arcs. Bubble 
sorting performs several iterations through a list, swapping the position of neighbour- 
ing list members that are in the wrong position until an iteration swaps no more mem- 
bers. The algorithm animation here is that of indicating the pair of nodes that are being 
tested and demonstrating swaps via the physical moving of the positions of swapped 
nodes. Both node highlighting and swapping are defined explicitly in the program.
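
As a rough illustration of what such an animation has to show, the following Python sketch runs a bubble sort and records the two kinds of events mentioned above. It is not a translation of the Grrr transformations: the function name `bubble_sort_events` and the event tuples are assumptions made for the example, and a plain Python list stands in for the list of connected nodes.

```python
def bubble_sort_events(values):
    """Bubble sort that records what an animation would display.

    Returns (sorted_list, events). Each event is ("highlight", i, j) when the
    pair at positions i and j is tested, or ("swap", i, j) when it is exchanged.
    """
    items = list(values)
    events = []
    swapped = True
    while swapped:  # repeat until a whole pass performs no swap
        swapped = False
        for i in range(len(items) - 1):
            events.append(("highlight", i, i + 1))
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                events.append(("swap", i, i + 1))
                swapped = True
    return items, events
```

Replaying the event list one entry at a time gives the highlight-then-move sequence that the animation presents to the viewer.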

The full bubble sort program is shown in Fig. 8, Fig. 9 and Fig. 10. Some illustrative
stages in execution are shown in Fig. 11, Fig. 12 and Fig. 13.

Fig. 8. The transformation 'BubbleSort'. This is the top level transformation in the
program and performs the tasks of iterating through the list, calling the transformation 'Swap',
which swaps the pairs, and 'HighlightPair', which indicates which pair of nodes is being
swapped

Fig. 9. The transformation 'Swap'. It has to deal with three cases: where there is a node to
either side of the swapped pair, where there are nodes at either end, and where the pair is alone.
It calls two built in transformations: 'NodeSeparation', which finds the distance between the
nodes, and 'Closer', which moves each node closer to the other by that distance, so animating
the swap by node movement.

Fig. 10. The transformation 'HighlightPair'. The first rewrite clears the current highlight
and is not called again because it is once only. The second rewrite highlights the indicated two
nodes and removes the 'HighlightPair' trigger. The final rewrite is present to deal with the case
that the chosen node does not have a following node in the list. Here the trigger is deleted with
nothing highlighted










Fig. 11. At the start of the 'BubbleSort' program 




Fig. 12. The host graph at step 76 in the rewriting process. This is the middle of the 
second iteration through the list. The next few steps will exchange both the graph theoretic and 
geometric positions of two highlighted nodes. The 'swapped' node indicates that a swap has 
already been performed on this iteration. The 'HighlightPair' trigger will be executed after the
swap and highlight the next pair to be tested 




Fig. 13. The host graph after the program has finished on step 166. The list is sorted



4 Conclusions and Further Work

The animation techniques presented here are not new; however, the graph rewriting
method used to create them is novel and the animation is easy to achieve, as in many
cases it requires no extra effort from the programmer. We see this as an application 
area that plays on the strengths of graph rewriting programming languages, as they are 
the only current systems which combine a visual view of graph data structures and 
computational completeness, so potentially allowing all possible graph based algo- 
rithms to be animated. Indeed, such animation is a great debugging aid when pro- 
gramming with Grrr. 

The algorithm animation capabilities presented here fall into two main visualisation 
techniques: animating node movement, and highlighting visited or chosen subgraphs. 
The methods used for producing animations can be partitioned into those that can be 
used on existing algorithms, such as showing the node movement in graph drawing, or 
displaying visited nodes, and those which are defined by the user, such as placing
nodes for illustration and selecting specific nodes to be highlighted. The techniques 
described here can be combined as wished. 

There are many possible areas of future work concerned with improving the usabil- 
ity of this programming language for the task of algorithm animation. The first im- 
portant requirement for development of graph based algorithms is graph editing. The 
current Grrr editor is proving tricky to use for the high volume graph production re- 
quired for animation. Further flexibility in cutting and pasting, and general user inter- 
face improvements are required. 

The definition of node movement and highlighting in transformations is currently 
explicit, that is, the nodes are moved and highlighted by built in transformations. One 
can envisage an implicit method for defining node movement, where a difference in 
position of a node from the LHS to a RHS would mean the node was moved in the 
host graph. Implicit highlighting is also possible, where a node highlight in a RHS 
would be reflected by the corresponding node being highlighted in the host graph. 

Node movement could be made easier by allowing a movement path to be defined 
by an arc. The node might follow the bends in the arc so as to produce more sophisti- 
cated user defined animation. 

Because many of the built in transformations do not change the structure of the 
graph, it can be difficult at times to ensure that they are called in the right order. For 
example, the current position of a node should be found before that node is moved. 
Hence we are considering adding some method for specifying the order of trigger 
node execution in RHS graphs.

Abstract. Cpfg is a program for simulating and visualizing plant development,
based on the theory of L-systems. A special-purpose programming
language, used to specify plant models, is an essential feature of
cpfg. We review postulates of L-system theory that have influenced the
design of this language. We then present the main constructs of this
language, and evaluate it from a user's perspective.



1 Introduction 

L-systems were introduced in 1968 as a mathematical theory of multi-cellular
development [19, 20], but soon afterwards they began to be used as a foundation
for plant modeling and simulation systems [30]. The first L-system-based
plant modeling program, CELIA (an acronym for the CEllular Linear Iterative
Array simulator), was created by Baker and Herman in the early seventies [2],
and was improved until the mid eighties. CELIA was followed by pfg (plant and
fractal generator) [10, 33], and its successor, cpfg (pfg with continuous
parameters) [11, 26, 27]. Existing and prospective applications of cpfg include computer
animation and landscape design, research and education in botany and ecology,
and decision support in agriculture, horticulture, and forestry [38]. The synergy
between scientific and visual objectives of plant modeling has been discussed
in [31].

A distinctive feature of cpfg is its modeling language, which makes it possible
for the user to easily specify and modify a wide range of plant models¹. The
cpfg modeling language [26] extends notions of L-system theory [12, 39] with
the following concepts:

1. parameters associated with L-system symbols to express quantitative
attributes of the modeled structures [11, 37],

2. programming constructs borrowed from other programming languages: local
and global variables, arrays, input-output functions, and flow control
statements [11, 15, 34],

3. modeling constructs without obvious counterparts in other programming
languages: decomposition and interpretation rules [27] and sub-L-systems [11],

4. programming constructs needed to capture plant responses to environmental
factors and to simulate bi-directional interaction between plants and their
environment [27, 28, 35],

5. graphical interpretation of L-systems based on turtle geometry [37].

¹ Related languages have also been implemented in other modeling programs based
on L-systems, for example GROGRA [17] and World Builder [1].

In addition, the language supports graphical modeling and rendering techniques 
needed for realistic visualization of the models: predefined bicubic surfaces for 
representing plant organs of a given shape [10, 37], developmental bicubic sur- 
faces for animating organ development [11], generalized cylinders for modeling 
stems with arbitrary cross-sections, and texture-mapped surfaces needed for 
more realistic rendering [27]. 

In this paper we present the design of the cpfg modeling language. We focus 
on items 2 and 3 of the above list, which have not been previously published 
outside of dissertations and theses. We begin by reviewing the elements of
L-system theory that have influenced the design of the cpfg language (Section 2).
On this basis, we present its essential constructs (Section 3). We conclude by 
summarizing our experience with the cpfg language, and discussing areas that 
require further research (Section 4). 

2 L-systems as a plant modeling paradigm 

2.1 A plant as a metapopulation 

A basic postulate of L-system theory [24, 30] is that a plant can be considered as 
a population (set) of discrete components, such as apices, internodes, leaves, and 
flowers. In simpler multicellular organisms, for example algae, these components 
can be identified with individual cells [19]. It is assumed that the set of organ 
types in organisms of a given species is finite, irrespective of the organism size. 
The type of an organ is represented by a symbol. Since the set of organ types is 
finite, the set (alphabet) V of symbols is finite as well. 

2.2 Branching architecture of plants 

L-system models operate at the level of plant architecture, which means that 
components of a model are assumed to be connected into a branching structure. 
From the graph-theoretic point of view, this structure can be described as an 
axial tree. Organs are represented by edges of this tree, and identified by symbols 
from alphabet V used as edge labels. 

Formally, an axial tree is a special type of rooted tree [37]. At each of its 
nodes we distinguish at most one outgoing edge called the straight segment. 
All remaining edges are called lateral or side segments. Within an axial tree, a 
sequence of edges is called an axis if: (i) the first edge in the sequence originates 
at the root of the tree or as a lateral segment at some node, (ii) each subsequent 
edge is a straight segment, and (iii) the last edge is not followed by any straight 
segment. The beginning and ending node of an axis are called its base and tip, 
respectively. An axis with all its descendants (edges that can be reached from 
the nodes within this axis) is called a branch.

2.3 Development as a parallel rewriting process

According to the theory of L-systems, plant development can be captured by
a set of productions that describe the fate of plant components over discrete 
time intervals, beginning with an initial structure (the axiom) [24, 30]. In graph- 
theoretic terms, a production replaces an edge of an axial tree, called the pro- 
duction predecessor, by an axial subtree called the successor. In the simplest 
case of development controlled by lineage [23, 32, 37], the suitable production is 
identified by the label of its predecessor, which must match the label of the re- 
placed edge in the tree. The successor is embedded into the resulting tree in such 
a manner that the starting node of the predecessor edge is mapped to the base 
of the successor axis, and the end node of the predecessor edge is mapped to the 
tip of the successor axis. Productions replace all modules of the predecessor tree 
in parallel derivation steps. This parallelism is intended to reflect simultaneous 
progress of time in all parts of the modeled organism. For example, Figure 1
presents the development of a hypothetical compound leaf with two segment 
types: the apices (thin lines) and the internodes (thick lines). The development 
begins with a single apical segment and is modeled using two productions. 




Fig. 1. Productions of a sample L-system and a sequence of derived structures



2.4 The bracketed string notation 

Within the theory of L-systems, axial trees are commonly specified as strings of
symbols (words) over the alphabet $V \cup \{[, ]\}$, where the bracket symbols [, ] do not
belong to the set of component symbols V. A sequence of labels of consecutive
straight segments represents an axis. A matching pair of brackets [...] encloses
a branch. For example, let A and B be symbols from alphabet V, and w be
a properly bracketed string with symbols from V. The notation ...A[w]B...
means that B is the straight segment that follows A in the axis ...AB..., and
w is the lateral branch originating at the end node of A.

Every axial tree can be described by a well-formed bracketed string, and every
well-formed bracketed string describes an axial tree (cf. [36]). Consequently,
bracketed string notation provides a convenient means for specifying an L-system
(the axiom and the set of productions) in a textual form. For example, the
L-system shown in Figure 1 can be written as

$$
\begin{array}{ll}
\omega : & A \\
p_1 : & A \rightarrow I[+A][-A]IA \\
p_2 : & I \rightarrow II
\end{array}
\qquad (1)
$$

where A denotes an apex and I denotes an internode segment. Symbols + and
− indicate the directions of branching (to the left and to the right, respectively)
according to the turtle geometry paradigm [37].
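
To make the parallel rewriting of bracketed strings concrete, here is a small Python sketch. It assumes that L-system (1) consists of the axiom A and the productions A → I[+A][-A]IA and I → II, which is the form implied by Model 1 in Section 3.1; the names `RULES` and `derive` are hypothetical and chosen only for this example. Brackets and the symbols + and - have no production and are copied unchanged.

```python
# Productions of the sample L-system, keyed by predecessor symbol
# (an assumed reading of L-system (1)).
RULES = {"A": "I[+A][-A]IA", "I": "II"}


def derive(word, rules, steps):
    """Apply the productions to every symbol of `word` in parallel, `steps` times.

    Symbols without a production (brackets, + and -) are copied unchanged.
    """
    for _ in range(steps):
        word = "".join(rules.get(sym, sym) for sym in word)
    return word
```

Because every symbol is replaced in the same step, each I doubles on every derivation step, so the number of symbols representing a single internode grows exponentially with the number of steps.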

2.5 Parametric L-systems

It is often necessary to characterize components of the modeled structure using 
continuously valued parameters. For example, parameter values may represent 
geometrical aspects of components, such as the length and diameter of an intern- 
ode, relationships between components, such as the magnitude of a branching 
angle, and physiological attributes of a component, such as its water content or 
concentration of photosynthates. The association of numerical parameters with 
L-system symbols was first described by Lindenmayer [21]. The cpfg language is 
based on a formalization of this concept called parametric L-systems [11, 37]. 

Parametric L-systems operate on bracketed parametric strings, that is, strings
of modules defined as symbols with associated numerical (real-valued)
parameters. The actual parameters that appear in parametric strings have their
counterparts in formal parameters that may appear in productions. For example,
a production that doubles length x of an internode I in every derivation step
may be written as

$$ I(x) \rightarrow I(2 * x). \qquad (2) $$

The above concepts serve as the foundation of the cpfg modeling language, 
described next. 

3 The cpfg modeling language 

3.1 Example of a simple cpfg model 

A simple cpfg model can be specified by complementing the axiom (listed after
the keyword axiom) and productions of a parametric L-system with three statements:
lsystem: label, endlsystem, and derivation length: length. The first
two statements delimit the L-system and assign a unique label to it; this makes it
possible to divide more complex models into several sub-L-systems (Section 3.5).
The remaining statement specifies the required derivation length. For example, a 
cpfg model describing the development of the compound leaf shown in Figure 1 
may be written as follows:

Model 1

#define growth_rate 2
lsystem: 1
derivation length: 5
axiom: A
A --> I(1)[+A][-A]I(1)A
I(x) --> I(growth_rate * x)
endlsystem

This model combines the basic structure of L-system (1) with the parametric
specification of internode elongation by production (2). The use of a parameter
makes it possible to avoid the exponential increase of the number of symbols
representing a growing internode, and makes it easy to modify the internode
growth rate. This is emphasized by the preprocessed #define statement, which
assigns a value to the growth_rate constant.
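
The effect of the parameter can be checked with a small Python sketch of Model 1. This is an illustration under stated assumptions, not the cpfg implementation: modules are represented as (symbol, parameters) tuples, and the names `GROWTH_RATE`, `step`, `derive`, and `to_text` are hypothetical.

```python
GROWTH_RATE = 2  # plays the role of the growth_rate constant in Model 1

# Successor of the apex production A --> I(1)[+A][-A]I(1)A.
APEX_SUCCESSOR = [
    ("I", (1,)), ("[", ()), ("+", ()), ("A", ()), ("]", ()),
    ("[", ()), ("-", ()), ("A", ()), ("]", ()),
    ("I", (1,)), ("A", ()),
]


def step(word):
    """One parallel derivation step over a list of (symbol, params) modules."""
    out = []
    for sym, params in word:
        if sym == "A":
            out.extend(APEX_SUCCESSOR)
        elif sym == "I":
            (x,) = params
            out.append(("I", (GROWTH_RATE * x,)))  # I(x) --> I(growth_rate * x)
        else:
            out.append((sym, params))  # brackets, + and - are copied
    return out


def derive(axiom, steps):
    word = list(axiom)
    for _ in range(steps):
        word = step(word)
    return word


def to_text(word):
    """Render modules in the bracketed-string notation, e.g. I(2)[+A]."""
    return "".join(
        "%s(%g)" % (sym, params[0]) if params else sym for sym, params in word
    )
```

With this representation an internode stays a single module whose parameter doubles on each step, instead of turning into a run of 2, 4, 8, ... identical symbols.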

3.2 Productions, variables, and statement blocks 

Model 1 incorporates the simplest type of productions used in cpfg models, the 
deterministic context-free productions. Cpfg also supports context-sensitive pro- 
ductions [19, 23], which make it possible to capture a wide range of interactions 
between components of a branching structure [32]. Overall, cpfg productions 
have the following syntax [11]: 

$$ lc < pred > rc : \{\alpha\}\; cond\; \{\beta\} \longrightarrow succ : prob. \qquad (3) $$

The terms lc, pred, rc, and succ are parametric strings denoting the left context,
the strict predecessor, the right context, and the successor of the production. The 
strict predecessor must not be the empty string. It usually consists of a single 
module, but may also include several modules. The strict predecessor and the 
successor, separated by an arrow, are the only mandatory terms of a production; 
all other terms and the separators related to them can be omitted. 

Cpfg productions can be applied in a deterministic or stochastic fashion. In 
the deterministic case, productions in the model are scanned in the order in 
which they appear, and the first applicable production is used. In contrast, in 
the stochastic case (indicated by the presence of prob expressions at the end 
of production specifications) all applicable productions are found, and one of 
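them is selected at random. Specifically [32], if $p_{i_1}, p_{i_2}, \ldots, p_{i_m}$ are the applicable
productions, and $\pi_{i_1}, \pi_{i_2}, \ldots, \pi_{i_m} > 0$ are the corresponding values of their prob
expressions, a production $p_{i_k}$ will be selected with the probability

$$ prob(p_{i_k}) = \frac{\pi_{i_k}}{\sum_{j=1}^{m} \pi_{i_j}}, \qquad k = 1, 2, \ldots, m. \qquad (4) $$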
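
The stochastic selection rule can be sketched in Python. The sketch assumes that equation (4) normalizes each prob value by the sum of the prob values of all applicable productions; the function names `selection_probabilities` and `select_production` are hypothetical, and the standard library call `random.choices` is used here only as a convenient way to draw a weighted sample, not as a description of how cpfg does it.

```python
import random


def selection_probabilities(weights):
    """Normalize the prob values of the applicable productions."""
    if any(w <= 0 for w in weights):
        raise ValueError("prob values must be positive")
    total = sum(weights)
    return [w / total for w in weights]


def select_production(productions, weights, rng=random):
    """Pick one applicable production at random, weighted by its prob value.

    `rng` may be the random module or a random.Random instance; passing a
    seeded instance makes a simulation reproducible.
    """
    if any(w <= 0 for w in weights):
        raise ValueError("prob values must be positive")
    return rng.choices(productions, weights=weights, k=1)[0]
```

Note that the prob values do not have to sum to one: only their ratios matter, because the set of applicable productions can differ from module to module.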

The condition cond is a logical expression that guards the application of the
production (the production may only be applied when cond evaluates to true).
The term $\alpha$ is a block of statements that is executed before the evaluation of
the condition cond. Similarly, $\beta$ is a block of statements that is executed after
the condition evaluation, if the result is true. The condition and the statement
blocks $\alpha$ and $\beta$ are expressed using a syntax based on the C programming
language [16]. Arithmetic expressions may also be included in the successor
specification, where they define the parameter values assigned to the successor modules
as discussed in Section 2.5. The logical and arithmetic expressions may include
the usual logical operators (Boolean operations AND, OR, NOT, and comparisons
of numerical values), arithmetic operators (addition, subtraction, multiplication,
division, remainder from integer division, and raising to a power), and
predefined functions. The supported functions include a selection of standard
mathematical functions (e.g. sin, atan, exp, floor, sign) and functions that
return pseudo-random values with given distributions (uniform, normal, beta).
In addition, the statement blocks may include assignments, if and if ... else
statements, while and do ... while loops, and C-like input-output functions
(printf, fopen, fclose, fprintf, fscanf).

The expressions and statements included in productions operate on formal
arguments listed in the production predecessor and context, local variables
(introduced in blocks $\alpha$ or $\beta$, and with a scope limited to individual productions),
and global variables, which can be accessed by all productions. A global variable
must be initialized in one of the following statement blocks:

Start: {statements} (executed at the beginning of the simulation),
End: {statements} (executed at the end of the simulation),
StartEach: {statements} (executed at the beginning of each step),
EndEach: {statements} (executed at the end of each step).

The statement blocks may be specified in any order between the lsystem and
axiom statements. In addition to global variables, the cpfg language also supports
globally defined arrays.

The use of these constructs is illustrated by the following model: 

Model 2

#define dt 0.03 /* time increment */
#define t_max 1.0 /* maximum age of the apex */
lsystem: 1
Start: {fp = fopen("statistics", "w"); step=0;}
StartEach: {step=step+1; n=0;}
EndEach: {fprintf(fp, "Step %d number of apices %d\n", step, n);}
End: {fclose(fp);}
derivation length: 200
axiom: A(0)
/* p1: advance apex age until t_max */
A(t) : {t_new = t+dt;} t_new<t_max {n=n+1;} --> A(t_new)
/* p2: create new organs when t_max has been reached */
A(t) : {t_new = t+dt;} t_new>=t_max
    {t_init = t_new-t_max; n=n+3;} -->
    I(t_init)[+A(t_init)][-A(t_init)]I(t_init)A(t_init)
/* p3: advance internode age */
I(t) --> I(t+dt)
endlsystem

The Start statements open the output file statistics and initialize the
global variable step that will count derivation steps. The StartEach statements
increment the step variable at the beginning of each derivation step, and set to
zero a global variable n, used to count the total number of apices in the structure.
The resulting number is reported (appended to the file statistics) at the end
of each derivation step by the EndEach statement. Finally, the End statement
closes the file statistics at the end of simulation.

In contrast to Model 1, in which the time increment associated with a derivation
step was an inherent feature of the model, in Model 2 the time increment
is controlled by a user-definable constant dt. According to production p1, the
age of an apex advances by the constant dt until the maximum age value t_max
has been reached. At that time the apex divides into several new modules, as
described by production p2. Since time advances by fixed increments dt, the age
t_new may actually exceed the maximum t_max, which is why the newly created
modules are assigned the initial age t_init = t_new - t_max by production p2.

The model makes use of global variables to collect the output data that 
quantify the simulation results. Productions p1 and p2 increment the global
variable n by the number of created apices (1 and 3, respectively). Consequently, 
variable n represents the total number of apices at the end of each derivation step. 
The sequence of these values constitutes the numerical output of the simulation. 
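
The bookkeeping performed by Model 2 can be reproduced in a short Python sketch. It is an illustration only, under the following assumptions: modules are (symbol, age) tuples with `None` as the age of brackets and turn symbols, the counter returned by `step` plays the role of the global variable n, and the names `DT`, `T_MAX`, `step`, and `simulate` are hypothetical. No output file is written; the per-step counts are returned as a list instead.

```python
DT = 0.03     # time increment, as in Model 2
T_MAX = 1.0   # maximum age of the apex, as in Model 2


def step(word):
    """One derivation step. Returns (new_word, n), where n counts the apices
    produced in this step, mirroring the increments n=n+1 and n=n+3."""
    out, n = [], 0
    for sym, t in word:
        if sym == "A":
            t_new = t + DT
            if t_new < T_MAX:  # p1: the apex simply ages
                out.append(("A", t_new))
                n += 1
            else:              # p2: the apex divides into new organs
                t0 = t_new - T_MAX
                out += [("I", t0), ("[", None), ("+", None), ("A", t0), ("]", None),
                        ("[", None), ("-", None), ("A", t0), ("]", None),
                        ("I", t0), ("A", t0)]
                n += 3
        elif sym == "I":       # p3: the internode ages
            out.append(("I", t + DT))
        else:
            out.append((sym, t))
    return out, n


def simulate(steps):
    """Run the model from the axiom A(0) and collect n after every step."""
    word, counts = [("A", 0.0)], []
    for _ in range(steps):
        word, n = step(word)
        counts.append(n)
    return word, counts
```

Because n is reset before each step and every apex in the new string is counted exactly once, the value recorded for a step equals the number of A modules present after that step.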

Strictly speaking, the use of global variables that can be changed by pro- 
ductions is inconsistent with the parallel application of productions postulated 
by the definition of L-systems. The reason is that different productions may at- 
tempt to assign different values to the same variable at the same time [34]. We 
address this problem at a conceptual level by assuming that the blocks of state- 
ments associated with productions are executed as indivisible operations in an 
arbitrary order. Thus, we use interleaving composition as a model of parallelism 
when evaluating these blocks [25]. In practice, cpfg is implemented as a sequential
program that applies productions one at a time (cf. [33, Appendix A]), thus
the assumption of indivisible execution of the statement blocks is automatically 
satisfied. 



3.3 Decomposition rules 

As noted in Section 2.3, an L-system production such as $A \rightarrow BC$ states that
module A produces modules B and C over time. In the context of plant modeling 
it is also convenient to have a construct for expressing another concept, namely 
that a compound module A consists of (or can be decomposed into) modules 
B and C. In cpfg, such structural relations are expressed using decomposition 
rules. They have the same syntax as context-free productions described in the 
previous section, but are preceded by the keyword decomposition in the cpfg
model. When decomposition rules are written outside a cpfg program, they are
indicated by symbol -d> used instead of --> in the production specification.

A derivation step in a model with decomposition consists of the application 
of productions, followed immediately by the application of decomposition rules. 
In general, the decomposition rules can be applied recursively, as long as there 
are modules that can be further decomposed. This possibility is illustrated by 
the following variant of Model 2: 

Model 3

#define dt 1.3 /* time increment */
#define t_max 1.0
lsystem: 1
derivation length: 0
axiom: A(0)
A(t) --> A(t+dt)
I(t) --> I(t+dt)
decomposition
A(t) : t>=t_max {t_init = t-t_max;} --> M(t_init)A(t_init)
M(t) --> I(t)[+A(t)][-A(t)]I(t)
endlsystem

In Model 3, productions simply increment the age of apices A and internodes I 
in each derivation step. The fate of an apex that has reached its maximum age 
t_max is expressed by the decomposition rules. The first rule states that such
an apex will produce a compound structure M (a metamer [3]) and a younger 
instance of the apex A. The second rule specifies that M consists of two internode 
segments I, which support a pair of lateral apices A at their join point. Thus, 
the description of the periodic activity of the apex has been separated from the 
detailed description of the produced structure M. The first derivation step in 
Model 3 is illustrated in Figure 2. 



A(0)

I(0.3) [ A(0.3) ] [ A(0.3) ] I(0.3) A(0.3)

Fig. 2. Illustration of a derivation step with decomposition. The application of the
production (thick line) is followed by the application of decomposition rules (thin lines).



If the time increment dt is greater than t_max, the initial age t_init of the
newly produced apices may still be greater than t_max, resulting in a recursive
application of the decomposition rules. This recursion will eventually end,
because the first decomposition rule reduces the age of new modules by t_max with
respect to the age of their parents. Thus, in addition to clarifying model
specification, decomposition rules make it possible to improve the models. Specifically,
Model 2 operates correctly only if dt < t_max, whereas Model 3 does not require
this assumption. The formal basis for advancing time by arbitrarily large steps
using recursively applied productions was introduced in [37] (timed L-systems).
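
The recursive application of decomposition rules can be illustrated with a Python sketch of Model 3. As before, this is a sketch under assumptions rather than the cpfg mechanism: modules are (symbol, age) tuples with `None` as the age of brackets and turn symbols, and the names `T_MAX`, `decompose`, and `step` are hypothetical.

```python
T_MAX = 1.0  # maximum age of the apex, as in Model 3


def decompose(module):
    """Recursively apply the two decomposition rules to a single module."""
    sym, t = module
    if sym == "A" and t >= T_MAX:
        # Rule 1: A(t) --> M(t_init) A(t_init), with t_init = t - t_max.
        t0 = t - T_MAX
        return decompose(("M", t0)) + decompose(("A", t0))
    if sym == "M":
        # Rule 2: M(t) --> I(t)[+A(t)][-A(t)]I(t); the parts may decompose further.
        parts = [("I", t), ("[", None), ("+", None), ("A", t), ("]", None),
                 ("[", None), ("-", None), ("A", t), ("]", None), ("I", t)]
        out = []
        for part in parts:
            out += decompose(part)
        return out
    return [module]  # nothing to decompose


def step(word, dt):
    """One derivation step: age A and I by dt, then decompose the result."""
    out = []
    for sym, t in word:
        if sym in ("A", "I"):
            out += decompose((sym, t + dt))
        else:
            out.append((sym, t))
    return out
```

With dt = 1.3 a single step from A(0) yields the structure shown in Fig. 2. With a larger increment such as dt = 2.5 the first rule fires twice along each apex lineage, and the recursion stops because every application lowers the age by t_max.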

The logical distinction between the relations “produced over time” and “being
part of”, which underlies the distinction between productions and decomposition
rules in cpfg models, was formalized by Woodger and Tarski [41], and
reviewed by Lindenmayer [22]. The multi-level specification of plants, implicit in 
the distinction between compound modules and their constituents, was analyzed 
by Godin and Caraglio [8]. 

3.4 Interpretation rules 

Using the terminology of [37], Models 1-3 are schematic: they only specify the 
topology of the developing structures. Interpretation rules provide a mechanism 
for complementing schematic models with the geometric information needed 
for visualization purposes. The interpretation rules do not affect the sequence
of strings μ0, μ1, μ2, ... derived by productions and decomposition rules, but
make it possible to replace modules in the derived strings by other modules or
sequences of modules (Figure 3), which may have a predefined graphical
interpretation. (This concept was first applied to L-system-based plant modeling by
Kurth [17].)

Fig. 3. Generation of a developmental sequence using a cpfg model with interpretation
rules. The progression of strings μ0, μ1, μ2, ... results from the derivation steps
defined by productions and decomposition rules. The interpretation rules map
strings μi into the final strings νi that contain the graphical information.

In cpfg, the interpretation rules are specified using the same syntax as
context-free productions and decomposition rules, following the keyword homomorphism.
Considered in isolation, they are indicated by the symbol -h> used instead of
the --> in the production specification. The keyword homomorphism reminds us
that, in the simplest case, the interpretation rules define a homomorphic
image [12, 39] of the string that has been generated using productions and
decomposition rules. In general, however, the interpretation rules extend the concept of
homomorphism, because they may operate on symbols with parameters, and can
be applied in hierarchical and recursive manners. In that sense, they resemble
decomposition rules.

For example, to visualize the structures generated by Models 2 or 3, one
could add the following lines before the endlsystem statement:

homomorphism

A(t) --> F(t/t_max) /* rule h1 */

I(t) --> F(0.5*2^(t/t_max)) /* rule h2 */

These rules replace modules A(t) (the apices) and I(t) (the internodes) with the
modules F(x), which are interpreted as straight line segments of length x [37].
According to rule h1, the length of an apex increases linearly with the apex
age t and reaches 1 at the division time, t = t_max. According to rule h2, the
length of an internode will double over every interval t_max. Due to the constant
0.5, a pair of newly created internodes will have the same combined length as
the apex that created them. This guarantees that the leaf shape will change
continuously with time (the model will satisfy the continuity criterion [37]).
In conclusion, the interpretation rules make it possible to separate the logic 
of developmental programs from model visualization. The resulting models are 
more clearly organized and easier to understand than the models in which both 
aspects of specification are interwoven. 
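The arithmetic behind the continuity argument can be checked numerically. The following sketch only restates rules h1 and h2 as functions; the function names are hypothetical and t_max = 1.0 is taken from Model 3.

```python
import math

T_MAX = 1.0  # t_max as defined in Model 3


def apex_length(t: float) -> float:
    """Rule h1: A(t) --> F(t/t_max)."""
    return t / T_MAX


def internode_length(t: float) -> float:
    """Rule h2: I(t) --> F(0.5*2^(t/t_max))."""
    return 0.5 * 2.0 ** (t / T_MAX)


if __name__ == "__main__":
    # Just before division the apex has length 1 ...
    print(apex_length(T_MAX))
    # ... and the two internodes created at age 0 have the same combined length.
    print(2 * internode_length(0.0))
    # An internode doubles its length over every interval t_max.
    print(math.isclose(internode_length(0.3 + T_MAX), 2 * internode_length(0.3)))
```

The new apex A(0) produced by the division has length 0 under rule h1, so the total length along the axis is unchanged at the moment of division.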



3.5 Sub-L-systems 

It is convenient to apply concepts of structured programming to plant modeling
and allow the modeler to partition large models into relatively independent
parts. For example, a modeler may want to specify the overall development of a 
plant branching structure separately from the development of individual plant 
organs, such as leaves and inflorescences, then combine these specifications into a 
comprehensive plant model. Such a structured approach increases the efficiency 
of model design and makes it possible to reuse the same components in different 
models. 

The partition of a developmental model into components is conceptually more 
difficult than the inclusion of predefined shapes into a static structure [10, 37], 
since the components may undergo changes as time progresses. To support the 
partitioning of developmental models, Hanan introduced the notion of sub-L- 
systems [11]. Sub-L-systems can be compared to subroutines in that they are 
invoked from the main L-system or other sub-L-systems to perform well defined, 
encapsulated tasks. Unlike subroutines in a sequential program, however, the
main L-system and the sub-L-systems may be active at the same time. Thus,
decomposition of a developmental model into the main L-system and a set of 
sub-L-systems preserves the parallel rewriting inherent in L-systems. 

From the viewpoint of formal language theory, sub-L-systems are related
to continuous grammars [6], in which the L-system-like rewriting mechanism is
applied to subwords of the rewritten word. Sub-L-systems generalize this
concept by allowing productions from different sets to be applied simultaneously to
specific substrings of the rewritten string in any derivation step.

In the cpfg language, separate parts of a model are identified by the numerical 
identifier id following the keyword lsystem (cf. Section 3.1). The main L-system
is the first one in the model. To invoke a sub-L-system, the calling L-system
produces a pair of reserved modules $(id) and $, which delimit the substring
to be rewritten using productions of L-system id. These delimiters must occur 
in matching pairs and may be nested within the string. The module with the 
id parameter invokes the sub-L-system, while the module without parameters 
returns control to the higher-level set of rules (Figure 4). 



[Figure 4: starting from the axiom, the main L-system produces strings of the form
... $(id1) ... $ ...; in subsequent derivation steps the main L-system and the
sub-L-systems id1 and id2 are applied to their respective delimited substrings.]

Fig. 4. Example of a developmental sequence generated by an L-system with two
sub-L-systems



For example, the following schematic model makes use of the sub-L-system 
mechanism to separate the development of a monopodial inflorescence [37] from 
the development of the individual flowers. 

Model 4

lsystem: 1               /* main L-system */
derivation length: 3
axiom: A                 /* initial string */
A --> I[$(2)A$]A         /* production p11 */
endlsystem

lsystem: 2               /* sub-L-system */
derivation length: 1     /* ignored */
axiom: ABC               /* ignored */
A --> B                  /* production p21 */
B --> C                  /* production p22 */
endlsystem




406 Przemyslaw Prusinkiewicz et al. 



This model generates the following sequence of parametric strings: 

A 

I[$(2)A$]A 

I[$(2)B$]I[$(2)A$]A 

I[$(2)C$]I[$(2)B$]I[$(2)A$]A

In each step, the production that is applied to an apex A depends on the
sub-L-system identified by the delimiters immediately enclosing it. Production p11 is
applied to the apex A at the right end of each string, creating an internode I, a
lateral branch incorporating the sub-L-system reference $(2)A$, and a new apex
A. Production p21 is applied to the module A appearing in the newly created
branch, producing a blossom B. In the next step, production p22 will transform
this blossom into a fruit C. The axiom and the derivation length of the
sub-L-system do not affect the simulation, but are needed when the sub-L-system is
being developed and tested independently of the main L-system.
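The dispatch of productions by the enclosing delimiters can be mimicked in a few lines. The sketch below is an illustration only, not the cpfg implementation: it handles single-letter, non-parametric modules, and the names PRODUCTIONS and derive are assumptions introduced here.

```python
from typing import Dict, List

# Productions of Model 4, keyed by L-system id; unmatched symbols are copied.
PRODUCTIONS: Dict[int, Dict[str, str]] = {
    1: {"A": "I[$(2)A$]A"},   # p11
    2: {"A": "B", "B": "C"},  # p21, p22
}


def derive(string: str, main_id: int = 1) -> str:
    """Perform one parallel derivation step, honouring $(id) ... $ delimiters."""
    out: List[str] = []
    stack = [main_id]  # ids of the L-systems enclosing the current position
    i = 0
    while i < len(string):
        ch = string[i]
        if ch == "$" and i + 1 < len(string) and string[i + 1] == "(":
            close = string.index(")", i)
            stack.append(int(string[i + 2:close]))  # $(id): enter sub-L-system
            out.append(string[i:close + 1])
            i = close + 1
        elif ch == "$":
            stack.pop()                             # $: return to the caller
            out.append(ch)
            i += 1
        else:
            # Rewrite the module with the productions of the innermost L-system.
            out.append(PRODUCTIONS[stack[-1]].get(ch, ch))
            i += 1
    return "".join(out)


if __name__ == "__main__":
    s = "A"
    print(s)
    for _ in range(3):  # derivation length of the main L-system
        s = derive(s)
        print(s)
```

Run for three steps from the axiom A, the sketch yields the same four strings as listed for Model 4. Successors are appended to the output without being rescanned, so all modules are rewritten in parallel within a step.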



4 Evaluation and conclusions 

Consecutive versions of cpfg and its predecessor pfg have been developed and 
used over the last 15 years to support research in plant modeling, computer
graphics, and botany. This long life span of cpfg results, first of all, from the 
soundness of the L-system theory that underlies its design. The L-system-based 
modeling language described in this paper is the essential feature of cpfg and 
provides the following benefits [28]: 

- At the conceptual level, it facilitates the design, specification, documenta- 
tion, and comparison of models. 

- At the level of model implementation, it makes it possible to develop soft- 
ware that can be reused in various models. Specifically, graphical capabilities 
needed to visualize the models become a part of the modeling program (cpfg), 
and do not have to be reimplemented. 

- Finally, the language facilitates interactive experimentation with the models. 

The cpfg modeling language incorporates many constructs that extend the 
notion of L-systems as used in formal language theory. Global variables and C- 
like functions make it possible to input experimental data to the models and 
output simulation results for further statistical analysis. Decomposition,
interpretation rules, and sub-L-systems lead to conceptually clear and well structured
model specifications. These capabilities play an essential role in the applications 
of cpfg to biological research and image synthesis. 

The cpfg modeling language inherits the conciseness of the mathematical no- 
tation of L-systems on which it is based. The user can specify simple models with 
only a few lines of code, and modify them easily during experimentation. The 
concise notation also emphasizes the conceptually intriguing database
amplification property inherent in many L-system models, i.e., the contrast between
the short specification of the models and the intricacy of the resulting structures
and behaviors [40].

Our experience with cpfg, and that of other users, has also highlighted
shortcomings of the present cpfg language. To begin with, a downside of the concise
notation is that large cpfg models look cryptic. The challenge is thus to define
a more legible language based on L-systems. It should improve the clarity of
specification, documentation, and ease of maintenance of complex models, while
still allowing for compact specification of simple models. The first step towards
this goal might be the incorporation of additional constructs found in other
programming languages. The most needed ones appear to be: multi-letter module
type identifiers providing an alternative to one-letter symbols, user-definable
functions, and user-definable data structures that could be passed as parameters
to modules.¹ The data structures would eliminate the need for the long parameter
lists that currently must be included in each reference to a cpfg module
with many parameters. This goal can also be achieved using name-value pairs,
as suggested by Borovikov [4].

The current format of production specification also could be improved. In 
many models the same predecessor module yields different successors depending 
on the context and the logical conditions that guard production application. 
According to the cpfg syntax, specification of different successors requires the 
use of separate productions. Thus, the same statement block a may have to be 
specified and executed several times, separately for each production, in order to 
provide arguments to different conditional expressions cond (cf. Section 3.2). An 
alternative is to introduce a more flexible production format that would allow for 
the selection of one of several successors depending on the production's context 
and conditions. A sample production syntax satisfying that requirement was 
proposed by Hammel [9].

The impact of using the bracketed string notation to specify productions op- 
erating on axial trees should also be re-examined. For example, strings A[B][C]D 
and A[C][B]D represent the same tree, yet the context-sensitive production 
A > [B] --> JA will only apply to the first representation. Ideally, the result
of production application should not depend on the form of the string represent- 
ing a given tree. Furthermore, an axial tree may have an indeterminate number 
of branches attached to the same node, but the bracketed string notation only 
makes it possible to specify a finite number of them as a production context. 
Consequently, concepts such as “a signal coming from any branch” or “signals
coming from all branches” cannot be expressed at present. One approach to address
these problems may be to consider the rewriting of trees as a special case of
a graph rewriting mechanism rather than a generalization of string rewriting, and 
examine the notions of graph L-systems [5, 29] and parallel graph grammars [7] 
as a possible foundation of an alternative plant modeling language. 
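The dependence on the string form can be made concrete with a small sketch. It rests on two simplifying assumptions that are not taken from cpfg itself: the right context [B] is matched literally against the characters following A, and a bracketed string of single-letter modules (starting with a module, not a bracket) is parsed into a tree whose lateral branches are unordered. All function names are hypothetical.

```python
from typing import Optional, Tuple

# A node is (symbol, sorted lateral branches, successor along the axis).
Node = Tuple[str, tuple, Optional[tuple]]


def _parse_axis(s: str, pos: int):
    """Parse one axis up to the matching ']' or the end of the string."""
    if pos >= len(s) or s[pos] == "]":
        return None, pos
    symbol = s[pos]
    pos += 1
    branches = []
    while pos < len(s) and s[pos] == "[":
        branch, pos = _parse_axis(s, pos + 1)
        pos += 1  # skip the closing ']'
        branches.append(branch)
    successor, pos = _parse_axis(s, pos)
    # Sorting the branches makes the representation independent of their order.
    return (symbol, tuple(sorted(branches, key=repr)), successor), pos


def parse(s: str) -> Optional[Node]:
    """Return a canonical tree for a well-formed bracketed string."""
    node, pos = _parse_axis(s, 0)
    if pos != len(s):
        raise ValueError("unbalanced brackets")
    return node


def string_context_matches(s: str) -> bool:
    """Literal string test: is the first A immediately followed by [B]?"""
    return s.startswith("[B]", s.index("A") + 1)


def tree_context_matches(s: str) -> bool:
    """Tree test: does the root A carry some lateral branch starting with B?"""
    symbol, branches, _ = parse(s)
    return symbol == "A" and any(b is not None and b[0] == "B" for b in branches)


if __name__ == "__main__":
    for text in ("A[B][C]D", "A[C][B]D"):
        print(text, string_context_matches(text), tree_context_matches(text))
```

Both strings parse to the same canonical tree, yet only the first passes the literal string test, whereas the tree-based test accepts both. A condition of the kind "a signal coming from any branch" is naturally phrased on the tree, as in tree_context_matches.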

¹ Multi-letter module names and simple function definitions can be introduced in the
current cpfg models using a macro preprocessor.

Apart from the practically motivated improvements to the cpfg language,
an interesting problem is the characterization of L-system-based languages in
relation to other programming languages. Two of the observed relations are
listed below.

- A production, for instance F(x) --> F(2*x), can be read as follows: “If a module
is of type F and the value of its parameter is x, then it will be replaced by
a module of the same type F, with the parameter value multiplied by two.” 
Thus, a production is a declarative statement, and a cpfg model consists of a 
set of such statements. This relates the cpfg language to the declarative style 
of programming found, in particular, in logic programming languages. The 
relation between L-systems and declarative programming was first studied 
by Lewis, and led to a concise implementation of the L-system derivation
mechanism in Prolog [18]. The declarative aspects of L-system models still 
require an in-depth analysis. 

- An L-system derivation step captures time advancement by some interval.
The time component inherent in production application relates L-systems 
to simulation languages. This relation was partially explored by Hogeweg 
[13, 14] and Hammel [9], who implemented L-system models in Simula. 

Over the years, the notion of L-systems led to the development of an entire 
branch of formal language theory, and became an inherent component of archi- 
tectural plant modeling. We expect that further research will lead to an improved 
design of practical plant modeling languages based on L-systems, and a better 
understanding of the place of L-systems in programming language theory. 
Abstract. This paper gives a short introduction to Treebag, a system
that employs tree grammars and tree transducers in order to generate 
and transform strings, trees, numbers, pictures, and other data types. 



1 Introduction

In the field of programming language definition and compiler generation, two 
central notions of immense theoretical and practical value are the syntactic 
generation and the syntax-directed translation of programming languages. The 
context-free part of the syntax of a programming language is usually defined 
by means of some grammatical device, which has the useful side-effect that the 
structure of a syntactically correct program can be represented by its derivation 
tree. Practically all modern approaches to programming language compilation 
rely on the availability of derivation trees and exploit their structural informa- 
tion in order to translate a source program into the intended target language. 
There is a common, well-known term for this approach: syntax-directed trans- 
lation. The input is a derivation tree of a program and the translation process 
transforms it into a program in the target language. 

Since the output of this process is again a program which obeys a specific 
syntax, it has a derivation tree, too. Therefore, compilation may be regarded 
as a tree-to-tree transformation, also called a tree transduction, which turns 
an input tree into an output tree. Thus, in order to develop a deeper under- 
standing of the matter it is useful to abstract from the programs behind and to 
study classes of tree grammars generating trees and classes of tree transducers 
transforming them. For the sake of simplicity and generality, the restriction to 
derivation trees is dropped and arbitrary rooted trees with node labels taken 
from a ranked alphabet (where the rank of each node coincides with the rank 
of its label) are considered. Thus, these trees are in fact terms or expressions 
over some unsorted set of function symbols. The investigation of tree languages 
and tree transductions began in the early seventies with the work of Rounds and
Thatcher [Rou69,Rou70a,Rou70b,Tha70,Tha73] and has led to a rich and
interesting theory which is still growing (see [GS84,GS97,FV98]).

On a certain level of abstraction, a program represented by a derivation tree 
may be perceived as the semantics of that tree. In other words, the evaluation 
of the tree results in a string — the program. However, it is clear that the strings 
could be replaced with any other data type. After all, trees can be interpreted 
as expressions in any semantic domain. This touches on a crucial point that every
student of computer science should learn, namely to distinguish between
syntax and semantics and to understand that 3 * (5 + 7) need not necessarily
denote a number (although it usually does).
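The distinction can be illustrated by evaluating one and the same tree in two different algebras. The sketch below is not part of Treebag: the nested-tuple tree representation and the names evaluate, NUMBERS, and STRINGS are assumptions chosen for this illustration.

```python
from typing import Any, Callable, Dict, Tuple, Union

Tree = Union[str, Tuple]  # a leaf label, or (label, child, child, ...)

# The tree underlying the expression 3 * (5 + 7).
TREE: Tree = ("*", "3", ("+", "5", "7"))

Algebra = Dict[str, Callable[..., Any]]

# Interpretation of the symbols as operations on integers.
NUMBERS: Algebra = {
    "3": lambda: 3, "5": lambda: 5, "7": lambda: 7,
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
}

# Interpretation of the same symbols as operations on strings.
STRINGS: Algebra = {
    "3": lambda: "3", "5": lambda: "5", "7": lambda: "7",
    "+": lambda a, b: f"({a} + {b})",
    "*": lambda a, b: f"({a} * {b})",
}


def evaluate(tree: Tree, algebra: Algebra) -> Any:
    """Evaluate a tree bottom-up with respect to the given algebra."""
    if isinstance(tree, str):
        return algebra[tree]()
    label, *children = tree
    return algebra[label](*(evaluate(child, algebra) for child in children))


if __name__ == "__main__":
    print(evaluate(TREE, NUMBERS))  # 36
    print(evaluate(TREE, STRINGS))  # (3 * (5 + 7))
```

The tree itself never changes; only the algebra passed to evaluate decides whether the result is a number or a string.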

The system Treebag (tree-based generator) presented in this note exploits
the observations made above. It makes it possible to generate and transform trees
using tree grammars and tree transducers of several kinds, and to evaluate the
output trees in various semantic domains (called algebras). In addition, display
components are available for visualising the resulting objects. Thus, there
is a strict distinction between syntax (i.e., the generated trees) and semantics 
(i.e., their value with respect to some algebra). Different views of one and the 
same tree are easily obtained by using several algebras in order to evaluate it. 

This paper is intended to give a short introduction to Treebag, to explain
its overall structure and the ideas behind it, and to illustrate it by means of an
example.



2 The structure of the system 



Treebag is a window-based system written in pure Java (JDK 1.1). Its main
window is called the Treebag worksheet, in which the user can interactively
build a network of Treebag components. The available classes of Treebag
components are divided into four categories, namely tree grammars, tree
transducers, algebras, and displays. An instance of a class is defined in a text file
with a specific syntax and can be loaded onto the worksheet, where it is repre- 
sented as a node. These nodes can be connected via edges in order to establish 
input/output relations. Drawing an edge from a tree grammar or tree transducer 
to another tree transducer or to a display defines the output trees of the former 
to be the input trees of the latter. If the target component is a display it will 
visualise the value of its input trees in a separate window. However, in order to 
enable it to do so, it must also be connected with an algebra which determines 
the semantics of the input trees received. 

Thus, the types of objects that one may generate within Treebag are deter- 
mined by the available classes of algebras. There are several such classes, the most 
interesting ones perhaps being those which yield graphical objects: chain-code 
algebras, turtle algebras, and collage algebras. The first two yield line drawings; 
their most important operation is the concatenation of line drawings. Collage 
algebras are based on n-ary operations which transform their arguments using 
n affine transformations, one for each argument, and then take the union of the 
transformed pictures. For a theoretical study of picture generation using these
algebras see [Dre98]. 

However, other algebras are useful as well. The free term algebra, which 
associates with every tree the tree itself, is one of these. Together with a display 
that visualises trees in a graphical or textual form it makes sure that one can 
always have a look at the “unevaluated” tree. 

3 An example 

A very simple sort of tree grammar is the regular tree grammar, being a rather
straightforward generalisation of the type-3 Chomsky grammar to the tree case. 
Here is a typical example: 

Barnsley = ({S, A},
            {f: 4, g: 4, line: 0},
            {S → f[line, S, A, A], S → line,
             A → g[line, A, S, S], A → line},
            S).

The rules in the third and fourth lines are the important part. They should be
read as term rewrite rules replacing the two nonterminals S and A
nondeterministically, where S is the start symbol. The symbols f, g, and line (of rank 4, 4,
and 0, respectively) are the terminals or output symbols.
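To make the nondeterministic rewriting concrete, the following sketch derives random terminal trees from the grammar Barnsley. It does not use the Treebag file syntax or API: the tuple-based tree representation, the depth budget, and the names RULES, RANKS, derive, to_term, and is_well_ranked are assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple

# Right-hand sides of the grammar Barnsley: (head symbol, argument symbols ...).
RULES: Dict[str, List[Tuple[str, ...]]] = {
    "S": [("f", "line", "S", "A", "A"), ("line",)],
    "A": [("g", "line", "A", "S", "S"), ("line",)],
}
RANKS = {"f": 4, "g": 4, "line": 0}


def derive(symbol: str, rng: random.Random, depth: int) -> tuple:
    """Rewrite a symbol until only terminals remain; a tree is (label, children ...)."""
    if symbol not in RULES:
        return (symbol,)  # terminal of rank 0
    alternatives = RULES[symbol]
    # Once the depth budget is used up, force the terminating rule X -> line.
    rhs = alternatives[1] if depth == 0 else rng.choice(alternatives)
    head, *args = rhs
    return (head, *(derive(arg, rng, depth - 1) for arg in args))


def to_term(tree: tuple) -> str:
    """Print a tree in the bracket notation used for the rules above."""
    head, *children = tree
    if not children:
        return head
    return f"{head}[{', '.join(to_term(child) for child in children)}]"


def is_well_ranked(tree: tuple) -> bool:
    """Check that every node has as many children as the rank of its label."""
    head, *children = tree
    return RANKS[head] == len(children) and all(map(is_well_ranked, children))


if __name__ == "__main__":
    tree = derive("S", random.Random(1), depth=2)
    print(to_term(tree))
    print(is_well_ranked(tree))
```

Every tree produced this way is a term over the ranked alphabet {f, g, line}. The depth budget is merely a device to keep the random derivation finite and is not part of the grammar.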

Having created a text file describing this grammar (using almost exactly the 
syntax above) one can load it onto the worksheet as a component. Adding two 
further components, namely the free term algebra and a display for trees, the 
configuration shown in Fig. 1 is obtained. The buttons on the small panel, which 
can be opened by double-clicking on the node representing the grammar, allow
the user to control the behaviour of the grammar.

In the next step, the trees generated by the grammar shall be interpreted by a 
so-called collage algebra, and the result visualised by an appropriate display. The 
algebra interprets line as a vertical line and f as an operation that transforms
its arguments roughly like this: 




The symbol g is interpreted in a similar way, but with slightly different
transformations (can you see the difference, looking at Fig. 2?). In addition, the
algebra interprets the nonterminals S and A (which, after all, are just symbols of
rank 0) as a square and a triangle, respectively. This has the advantage that even
nonterminal trees yield pictures, in which the nonterminals are visible as
geometric parts. Fig. 2 shows such a situation. The generated pictures approximate
a variant of the well-known Barnsley fern (see [Bar88]).

Fig. 1. Generating trees by the regular tree grammar Barnsley

Fig. 2. Interpreting generated trees by a collage algebra

It should be mentioned that collage algebras are able to save the displayed 
pictures as PostScript files. Thus, it is not necessary to rely on screenshots like 
those in Fig. 1 and 2. As an example, one of the pictures produced by the 
configuration of components in Fig. 2 is depicted in Fig. 3. 




Fig. 3. One of the pictures produced by the Treebag components in Fig. 2 

Finally, we shall enrich the configuration by a tree transduction tt. It is a
so-called top-down tree transduction, and is given by the rules

tt[h[x1, x2, x3]] → tri[tt[x1], tt[x2], tt[x3]] for h ∈ {f, g},
tt[line] → triangle.

As in the case of regular tree grammars, the rules are to be interpreted as term
rewrite rules, a derivation on input t starting with the tree tt[t].

If the collage algebra is extended in order to interpret tri by applying the
famous transformations for the Sierpiński triangle, triangle as a triangle, and tt
as the identity (to take care of transformations not yet terminated),
configurations as in Fig. 4 are the typical ones (notice that the tree display is now used
to show the output trees of the tree transduction).




Fig. 4. Transforming the Barnsley fern into the Sierpiński triangle



4 Concluding remarks 

The system Treebag presented in this note is a flexible tool to demonstrate and 
explore mechanisms for the syntactic generation of pictures and other data types. 
The system is implemented in Java (JDK 1.1) without any native extensions 
and can be obtained at http://www.informatik.uni-bremen.de/~drewes/treebag, 
including a short manual and lots of examples, ranging from very simple ones
(like those used for demonstration purposes in this paper) to more elaborate
ones. A sample collage from one of the more complicated examples is pictured
in Fig. 5.

Fig. 5. This example can be found in .../treebag/examples/collage/tilings/coloured/
Abstract. In this contribution we present tool support addressing the multiple
perspectives problem. For organizing and integrating multiple stakeholders, the
development processes and notations they use, and the partial specifications
they produce, we use the ViewPoints framework. A formalization of the
ViewPoints framework based on distributed graph transformation builds the
foundation of our tool.



Introduction and Related Work 

The concepts underlying the ViewPoints framework, as well as its formalization by
distributed graph transformation, have been sketched in [2, 3]. A detailed description
of the graph transformation tool AGG, on which our tool is based, can be found in [4].

First we will give a general introduction to our tool in the chapter The ViewPoint 
Tool. Then we will present the three main tool components in the chapters ViewPoint 
Manager, Template Editor and ViewPoint Editor. Finally, we will refer to advanced 
features of our tool in the chapters Inconsistency Management and Distribution. 



The ViewPoint Tool 

The ViewPoint Tool comprises the ViewPoint manager, the template editor and the
ViewPoint editor (cf. Figure 1). While the ViewPoint manager serves as a central tool
to coordinate all activities carried out within the ViewPoints framework, the template
editor is used to design ViewPoint templates (i.e. styles combined with work plan
actions), and the ViewPoint editor is used to work with specifications and work records
of actual ViewPoints. All ViewPoints used in the ViewPoint editor have to be
instantiated from existing ViewPoint templates developed by the template editor. This tool
structure has been influenced by the distinction of the ‘method engineer’ and the
‘method user’ working areas [3] in the sense that the main task of a method engineer
is to develop ViewPoint templates using the template editor, and the method user
builds ViewPoint systems with actual specifications based on a library of ViewPoint
templates.




Fig. 1. Structure of the ViewPoint Tool. 

In the next three chapters we will now present the ViewPoint manager, the template 
editor and the ViewPoint editor in detail. 



ViewPoint Manager 

The ViewPoint manager serves to organize all developed ViewPoint templates and all
actual ViewPoints. It is used as a starting point to enter the template editor and the
ViewPoint editor (cf. Figure 2).

First, ViewPoint templates can be created, which can then be developed further
within the template editor. After finishing the specification of a ViewPoint template,
the ViewPoint manager offers a compile function, i.e. the underlying graph
transformation model has to be enriched with additional ViewPoint-specific information (cf.
section Inconsistency Management). Then actual ViewPoints, which are usable within
the ViewPoint editor, can be instantiated from a compiled template.

The upper half of the ViewPoint manager window depicted in Figure 2 is used to 
access all ViewPoint templates in development while providing the following addi- 
tional information: 

• in design: a corresponding template editor window is currently open, 

• uncompiled: no corresponding template editor window is open and the template 
has not yet been compiled, 

• free: the compilation process has been terminated correctly and actual ViewPoints 
can be generated from this template, 

• #*: # symbolizes the number of instantiated ViewPoints,

• protect/inj.: for this ViewPoint template the Double Pushout approach is used 
within the underlying graph transformation formalization. 

The lower half of the ViewPoint manager window depicted in Figure 2 is used to 
access all actual ViewPoints providing additional information about the ViewPoint 
owner and the status of corresponding ViewPoint editor windows (open/close).

Fig. 2. ViewPoint manager window.



Template Editor 

The template editor allows the user to edit the work plan slot and the style slot of a ViewPoint
template. All actions of the ViewPoint template’s work plan have to be modeled as 
graph transformation rules (cf. Figure 3). 




Fig. 3. The work plan window of the template editor. 

As mentioned in [2, 3], the style slot of a ViewPoint template is represented by a
graph transformation system the rules of which are assembly actions. Figure 4 shows
the style window of the template editor listing all assembly actions as well as the
graph transformation system’s start graph.

Please note that all work plan actions of a single ViewPoint template have to be
specified in the same style. E.g., if the style slot encapsulates a graph transformation
system for generating ER diagrams, all graph elements in the work plan’s actions
have to be valid ER elements defined by the assembly actions, and no other types of
graph elements may be used within this template (cf. chapter Distribution for
defining Inter-ViewPoint check actions involving multiple styles).








Fig. 4. The style window of the template editor. 



ViewPoint Editor

The ViewPoint editor allows the user to edit a ViewPoint instantiated from a ViewPoint
template developed by the template editor. All actions defined in the corresponding
template’s work plan can be applied in order to build an actual specification. Figure 5
depicts a ViewPoint editor window; the actual specification is displayed on the left
and all work plan actions are listed on the right.

Fig. 5. The ViewPoint editor.

After presenting the three main components of the ViewPoint Tool - the ViewPoint 
manager, the template editor and the ViewPoint editor - we will now sketch how 
inconsistency management is supported. 



Inconsistency Management 

The ViewPoints framework proposes the policy of living with inconsistencies, i.e. to
tolerate them and to exploit the information involved to push forward the develop- 
ment process [3]. In [1] an inconsistency management reference model is presented 
comprising actions like ignoring inconsistencies, delaying their resolution, circum- 
venting the consistency requirement, incrementally ameliorating the situation or total 
resolution. In [3] we have sketched how this inconsistency management model can be 
realized by distributed graph transformation. 

In order to realize inconsistency handling functionality such as incremental
resolution or ‘undo’ operations with respect to development steps, we need a mechanism for
tracking development steps. As mentioned above, this is given by the development tree in a
ViewPoint’s work record, where each edge represents a development step (i.e. the
application of a work plan action). However, for implementing these ideas we have to
manage additional information, e.g., to store old system states in the case of ‘undo’
operations or to establish links between graph elements of the work record and the
specification for tracking desired or inconsistent system states. The ViewPoint Tool
can visualize this internal information as gray shaded graph elements (cf. Figure
6). The two root nodes SPEC and WR denote the specification slot and the work
record slot, respectively, and the nodes labeled step indicate development steps.

Fig. 6. The ViewPoint specification shown in Figure 5 with additional internal information.

By maintaining this internal information we have provided the foundation for
inconsistency management. Currently, inconsistency handling is realized by defining check
actions in a ViewPoint template's work plan. An incremental inconsistency resolution
model which makes intensive use of the work record is currently being implemented.
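The role of the work record can be illustrated with a small sketch. The following
Python fragment is not taken from the ViewPoint Tool; it only assumes, as described
above, that every development step is an edge in a development tree and that old
system states are stored so that a step can be undone. The names Step, WorkRecord,
apply and undo are hypothetical, and the specification is simplified to a set of
element names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Step:
    """A node of the development tree; the edge from its parent is one development step."""
    action: str                       # name of the applied work plan action (hypothetical)
    spec_before: frozenset            # specification state stored to support 'undo'
    parent: Optional["Step"] = None
    children: list = field(default_factory=list)


class WorkRecord:
    """Tracks development steps as a tree and keeps the current specification."""

    def __init__(self):
        self.spec = frozenset()
        self.root = Step("init", self.spec)
        self.current = self.root

    def apply(self, action, added=(), removed=()):
        """Record a development step and update the specification."""
        step = Step(action, self.spec, parent=self.current)
        self.current.children.append(step)
        self.current = step
        self.spec = (self.spec - frozenset(removed)) | frozenset(added)

    def undo(self):
        """Restore the state before the current step; the step stays in the tree."""
        if self.current.parent is None:
            raise ValueError("nothing to undo")
        self.spec = self.current.spec_before
        self.current = self.current.parent
```

Because an undone step is kept in the tree, applying a different action afterwards
creates a second branch below the same node, which is one way to read the term
development tree.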

The next chapter addresses distribution aspects.



Distribution 

Since the implementation of distributed graph transformation within the tool distributed
AGG is not yet finished, the ViewPoint Tool currently uses the local version of
AGG [4]. Thus, the concepts of distributed use / correspondence relations within
distributed graph transformation as introduced in [3, 2] are not yet supported.
Currently, for modeling Inter-ViewPoint check actions between distributed ViewPoints,
we support the use of positive application conditions (PACs). While all actions in a
ViewPoint template's work plan have to be specified according to the template's style
(see above), a PAC may contain a graph structure in the style of another ViewPoint.
Thus Inter-ViewPoint checks involving different styles can be realized and integration
of different views is enabled. E.g., the left-hand side rule graph of an
Inter-ViewPoint check action defined in an ER diagram ViewPoint template may contain
an ER structure while its PAC may contain a Petri net specification.



Conclusion and Further Work 

In this contribution we have presented tool support for our approach of
ViewPoint-oriented software development based on distributed graph transformation. While
we have given a detailed description of our approach using distributed graph
transformation as underlying semantics of the ViewPoints framework, including several
examples as well as an elaboration of our approach's benefits, in [3, 2], this
contribution focuses on the ViewPoint Tool.

The ViewPoint Tool presented in this paper is based on the local version of AGG 
[4]. Currently we are working on integrating the features of a prototype AGG version 
realizing distributed graph transformation.

Abstract. This paper gives a brief introduction to the UPGRADE
framework which is designed to support generating code for structure-oriented
software engineering tools from graph-based specifications. The
framework is written in Java and contains a variety of mechanisms for
configuring the representation of internal models of generated tools. The
tool designer can therefore build tools which offer the user a convenient
representation of a visual language instead of restricting the user to a
predefined notation. Generation of a tool requires no additional programming
effort after finishing the specification. It is thus easy to alter the
tool's internal logic just by changing its specification and by generating
a modified tool version from it.

Keywords: structure-oriented (software) engineering tools, graph-based 
specification, generation of tools, adaptation to different representations, 
configuring visual tools 



1 Introduction 

UPGRADE (Universal Platform for Graph-based Application Development) is
a framework for generating structure-oriented (software) engineering tools which
are based on graphs and graph rewriting as the internal modeling formalism. The
models constructed and manipulated by these tools are based on some graphical
language. In contrast to graphical editors, a structure-oriented engineering tool
is aware of the semantics of the visual language's elements. It is therefore able to
restrict the user to those operations on the model which preserve its consistency.

The elements, structural constraints and operations of a model are formally
specified as a graph-based meta-model in the executable specification language
PROGRES [8,9]. The language PROGRES combines concepts from database
systems, procedural and rule-based programming. The graph structure is specified
by a graph schema which defines types of nodes, edges, and attributes.
Graph transformations are described by graph rewrite rules which essentially
consist of a left-hand side (the graph pattern to be replaced) and a right-hand
side (the replacing graph pattern). Programming with graph rewrite rules
is supported by control structures such as sequence, branch, loop, etc.

By generating source code from the formal specification, we get an implementation
of a tool's commands. In addition, a user interface is required to call
these commands and to visualize the tool's internal model. This user interface is
provided by the UPGRADE framework.
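To make the notion of a graph rewrite rule more concrete, the following Python
sketch replaces one occurrence of a left-hand side pattern by a right-hand side
pattern. It is an illustration only and does not reproduce PROGRES syntax or
semantics: graphs are simplified to sets of labeled edges, every node name in a
pattern is treated as a variable, matches need not be injective, and the
right-hand side may only use variables bound by the left-hand side. The function
names match and rewrite are made up for this example.

```python
def _bind(binding, var, value):
    """Bind a pattern variable to a graph node, or check an existing binding."""
    if var in binding:
        return binding[var] == value
    binding[var] = value
    return True


def match(pattern, graph, binding=None):
    """Yield variable bindings under which every pattern edge occurs in the graph."""
    binding = binding or {}
    if not pattern:
        yield binding
        return
    (s, label, t), rest = pattern[0], pattern[1:]
    for gs, glabel, gt in sorted(graph):
        if glabel != label:
            continue
        trial = dict(binding)
        if all(_bind(trial, var, val) for var, val in ((s, gs), (t, gt))):
            yield from match(rest, graph, trial)


def rewrite(graph, lhs, rhs):
    """Apply the rule once: replace the first match of lhs by the instantiated rhs."""
    for binding in match(lhs, graph):
        removed = {(binding[s], label, binding[t]) for s, label, t in lhs}
        added = {(binding[s], label, binding[t]) for s, label, t in rhs}
        return (graph - removed) | added
    return graph  # the rule is not applicable; the graph is left unchanged


# Rule: a chain a -> b -> c of "next" edges is replaced by a single edge a -> c.
LHS = [("a", "next", "b"), ("b", "next", "c")]
RHS = [("a", "next", "c")]
```

Control structures such as sequence or loop then amount to calling rewrite
repeatedly, for instance until the graph no longer changes.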

Special emphasis was put on the framework's configurability and flexibility.
We are, therefore, able to construct engineering tools which, in spite of being
prototypes, provide the user with a convenient representation of an arbitrary visual
language and a comfortable user interface.

In Section 2, I will sketch how the functions which implement the tool's commands
are generated from the formal specifications and how they are integrated
into the UPGRADE framework. In Section 3, I will discuss the main features of
the framework. Section 4 concludes this presentation.



2 Specification and Generation of Tools 



This section gives an overview of the process of tool generation. A more detailed
description can be found in [3].

Graph-based specifications are executable in the integrated development environment
of PROGRES. By executing a specification's rewrite rules, also called
productions, graphs can be constructed and manipulated. These graphs are
stored in the underlying special purpose database GRAS (GRaph Storage) [6].
Productions can be grouped into transactions on the database which have the
usual ACID properties.

The transactions can be viewed as basic operations on the model (the graph
stored in the database). By executing a transaction, the model is manipulated in
a consistent and formally defined manner. The PROGRES specification therefore
constitutes a meta-model of a tool by specifying its internal data structure (the
model) and the operations a user can perform on it. The task of the UPGRADE
framework is to provide the user with software components for a convenient
visualization of the internal graph structures and for comfortable invocation of
the specified commands.
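The idea that a transaction groups several productions into one consistent
operation on the model can be sketched as follows. The Python fragment below is
a simplified illustration and not the GRAS interface: it only shows the
all-or-nothing behavior by applying the productions to a working copy and
committing the copy when every production succeeded. Graph, run_transaction,
add_task and add_flow are hypothetical names.

```python
import copy


class Graph:
    """A very small in-memory model: typed nodes and labeled edges."""

    def __init__(self):
        self.nodes = {}     # node id -> node type
        self.edges = set()  # (source, label, target)


def run_transaction(graph, productions):
    """Apply all productions or none of them; return True on commit."""
    working = copy.deepcopy(graph)
    for production in productions:
        if not production(working):
            return False  # abort: the original graph has not been touched
    graph.nodes, graph.edges = working.nodes, working.edges
    return True


def add_task(name):
    """Production that creates a task node; fails if the node already exists."""
    def production(g):
        if name in g.nodes:
            return False
        g.nodes[name] = "Task"
        return True
    return production


def add_flow(source, target):
    """Production that connects two existing tasks by a control flow edge."""
    def production(g):
        if source not in g.nodes or target not in g.nodes:
            return False
        g.edges.add((source, "control_flow", target))
        return True
    return production
```

A transaction such as [add_task("Test"), add_flow("Test", "Deploy")] leaves the
model unchanged if the task Deploy does not exist, so a user command never
produces a half-finished model.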

From PROGRES specifications it is possible to generate C code for each of
the transactions. The generated code calls the interface functions of the database
in exactly the way the PROGRES interpreter does when executing a specification.
The generated C code therefore efficiently implements the specified commands
for the construction and manipulation of models.

The UPGRADE framework, which is implemented in Java, allows the construction
of tools which call the generated code and receive events from the
database whenever changes occur. It allows the designer of the tool to present
the elements of the graph to the user exactly the way the visual language is
meant to look. As an example, Figure 1 shows a tool for managing development
processes which was specified in PROGRES and implemented using the UPGRADE
framework. The tool is based on the formalism of Dynamic Task Nets
for process modeling [2].

Moreover, through the decoupling of internal logic, provided by the generated
code, and visualization, provided by the framework, it is easy to change the tool's
underlying meta-model by simply changing the PROGRES specification and
generating new source code from it. Tool designers can therefore use PROGRES
and the UPGRADE framework for rapid prototyping in early design phases when
the tool design is still unstable and explore the consequences of design decisions
interactively.
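The decoupling of internal logic and visualization through change events can be
pictured with a small observer sketch. The Python fragment below is an assumption
about the general pattern, not the UPGRADE API (which is written in Java): a
database object notifies registered views about changes, and each view decides
which element types it displays. GraphDatabase, View and the event name
node_created are invented for this illustration.

```python
class GraphDatabase:
    """Stores typed nodes and notifies listeners whenever a node is created."""

    def __init__(self):
        self.nodes = {}
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def create_node(self, node_id, node_type):
        self.nodes[node_id] = node_type
        for listener in self._listeners:
            listener("node_created", node_id, node_type)


class View:
    """Displays only the node types it was configured for."""

    def __init__(self, database, visible_types):
        self.visible_types = set(visible_types)
        self.displayed = []
        database.subscribe(self.on_event)

    def on_event(self, kind, node_id, node_type):
        if kind == "node_created" and node_type in self.visible_types:
            self.displayed.append(node_id)
```

Two views on the same database can thus show different parts of one model, for
example a view restricted to tasks next to a view that also shows parameters.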




Fig. 1. Process management tool based on UPGRADE 



3 Main Features of the Framework 

A main goal when developing the UPGRADE framework was to make it as
flexible and configurable as possible. Instead of restricting the user to predefined
shapes for displaying the internal graph, the UPGRADE framework enables the
user to easily integrate his own Java classes for node and edge visualization by
inheriting from the existing ones. As the intended use of the framework is to
generate prototypes of tools that implement a visual language, it is important
that the appearances of nodes and edges match the language designer's
intention as closely as possible.
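Extending the set of representation classes by inheritance might look roughly as
follows. The sketch uses Python instead of Java and invents all class names; it
only shows the principle that a new representation reuses an existing one and
overrides what differs.

```python
class NodeRepresentation:
    """Stand-in for an existing framework class that draws a node (hypothetical name)."""

    shape = "box"

    def __init__(self, label):
        self.label = label

    def render(self):
        # A real representation would paint on a canvas; a string keeps the sketch small.
        return f"{self.shape}[{self.label}]"


class ParameterRepresentation(NodeRepresentation):
    """Only the shape differs from the inherited representation."""

    shape = "circle"


class TaskRepresentation(NodeRepresentation):
    """Adds the task state to the inherited drawing."""

    def __init__(self, label, state):
        super().__init__(label)
        self.state = state

    def render(self):
        return f"{super().render()}<{self.state}>"
```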

We distinguish two user roles. The tool designer is the one who uses the
framework for building a new tool. In some cases this might require an extension
of the framework to achieve the desired graphical representation, but we
expect that the ongoing usage of the framework will lead to a library of classes
for representation elements (nodes and edges) which will satisfy most designers'
needs.

The main task of the tool designer is to configure the existing parts of the
framework:

- At first, a mapping from the model inside the database to the graph diagram
which shall be displayed by the tool has to be defined. Graph-based models
usually contain a lot of internal elements that are not part of the visual
language to be presented to the user. We employ several levels of filters
above the database to provide the upper layers of the framework with a
simplified view on the model. The definition of these mappings, as well as
the other configuration steps to be mentioned in the following, require no
additional programming. They are performed solely through dialog boxes
and by editing configuration files.

Fig. 2. Choosing and configuring representation classes

- For each node and edge class of the graph schema, the tool designer has to 
select an appropriate Java class for displaying it. It is possible to relate such 
a representation class to a class or type of the graph schema as well as to 
let it depend on the values of attributes of a node or edge. In this way, we 
can even use different Java classes for visualizing different states of model 
elements (see Figure 2). 

- After having selected the representation classes, the tool designer has to 
configure them, e.g. by selecting shapes, sizes, colors, label positions, etc. 

- The framework offers a number of layout algorithms the behavior of which 
can be adapted as well. By combining these adapted layout algorithms it is 
possible to arrange the representation elements according to a complex set 
of rules. 

For example, the layout of the task net of the process management tool in
Figure 1 has been calculated in two steps. First, the tasks of the process,
which are represented by boxes, were arranged from left to right according
to the flow of control, represented by solid arrows, using the Sugiyama
algorithm [10]. The Sugiyama algorithm was configured, by means of dialog
windows, to ignore all other node and edge types in this phase (see Figure
3). In a second step, a constraint solving layout algorithm was used to attach
the input and output parameters, shown as circles, to left and right edges of
their tasks.




Fig. 3. Configuration of layout algorithms 

- To make the commands, which are specified as transactions on the graph 
database, accessible to the user, we have to define menus and tool bars of 
the tool and the commands which should be invoked by them. This is done 
within an XML configuration file which is read when the tool starts. 

- Finally, within another configuration file, the combination of different views
and their interaction within one window can be specified. For example, the
process management tool of Figure 1 combines a tree view, showing the task
hierarchy, in the left part of the window and a net view, showing the tasks
ordered according to the flow of control, in the right part of the window. The
generated tools employ a multi-view / multi-user paradigm. One tool instance
may open several views on a model located in the database and there may
also be different instances of the tool working on the same model. Within a
single tool instance the views can be coupled, which means that commands
invoked in one view can affect how another view displays its content.
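The configuration of menus through an XML file, mentioned in the list above, can
be illustrated as follows. The actual format of the UPGRADE configuration files
is not given here, so the element and attribute names in this Python sketch
(tool, menu, item, label, transaction) are purely hypothetical; the point is only
that menu entries are mapped to transaction names without any programming.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration; the real UPGRADE file format may look different.
CONFIG = """
<tool>
  <menu name="Task">
    <item label="New task" transaction="CreateTask"/>
    <item label="Delete task" transaction="DeleteTask"/>
  </menu>
  <menu name="View">
    <item label="Layout" transaction="RunLayout"/>
  </menu>
</tool>
"""


def load_menus(xml_text):
    """Return a mapping from menu names to lists of (label, transaction) pairs."""
    root = ET.fromstring(xml_text)
    return {
        menu.get("name"): [
            (item.get("label"), item.get("transaction"))
            for item in menu.findall("item")
        ]
        for menu in root.findall("menu")
    }
```

A tool reading such a file at start-up would create one menu entry per item and
invoke the named transaction when the entry is selected.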

Besides the tool designer, there is the role of the tool user. Being the one
working with the final tool, he can only configure it to a limited extent. He may
choose between alternative ways of representing model elements if those have
been specified by the tool designer, or he may invoke a layout algorithm, but
he is not allowed to define his own representations or to configure the way the
layout algorithms work.

4 Conclusion 

By using the UPGRADE framework, a tool designer is able to build a
structure-oriented (software) engineering tool which on the one hand is based on a formal
specification and on the other offers a convenient user interface. The process of
building a tool requires no additional programming effort as soon as the PROGRES
specification is completed. This approach minimizes the effort necessary
for changing a tool's internal logic and thus enables the tool designer to explore
the consequences of design decisions interactively. It also offers the possibility
to incorporate knowledge gathered while applying a tool into the next version.
Thereby, it is possible to construct tool versions which are especially adapted
for specific application domains [7, 5].

The UPGRADE framework is applied within several projects in our department.
Among these is the AHEAD project which is concerned with the construction
of an adaptable human-centered environment for the management of
development processes [4]. Moreover, the framework is used for constructing a
high-level authoring support environment [1].

Abstract. DiaGen is a specification method, which is primarily based
on a hypergraph grammar, and a tool that allows diagram editors to be
generated automatically from such a specification. Generated editors are
free-hand editors, but with an automatic, constraint-based layout for
correct diagrams. A hypergraph parser checks diagram correctness and
makes it possible to translate diagrams into some user-defined semantic
representation. This paper briefly outlines DiaGen and the process of
creating diagram editors with DiaGen.



1 Introduction 

Today many systems communicate information through diagrams and therefore
have to contain diagram editors, i.e., graphical editors that are tailored to
the corresponding class of diagrams. Examples are current UML tools, which
offer editors for class diagrams, sequence diagrams and others [8], or visual
programming tools for programmable logic controllers, which allow the user to edit
ladder diagrams and sequential function charts [3]. When implementing such a diagram
editor, two main problems have to be tackled: First, the diagram language must
be exactly specified. The specification has to describe its syntax and its semantics,
i.e., the rules how to build correct diagrams and the meaning of diagrams.
The voluminous documentation on UML [8] shows that this is a nontrivial task.
The second problem consists of implementing an editor which conforms to this
specification. Such an editor supports either syntax-directed editing or free-hand
editing. Syntax-directed editors offer a restricted set of editing operations which
can be used to create and edit diagrams. Free-hand editors, on the other hand,
provide no specific editing operations but allow the user to modify diagrams in any
way and may thus produce incorrect diagrams. A parser is responsible for checking
the diagrams' syntax and for distinguishing correct from incorrect ones.

The Diagram editor Generator DiaGen is a tool that offers support for the
two problems explained in the previous paragraph: DiaGen
primarily uses attributed hypergraph grammars as a powerful means to specify
diagram languages. Main parts of the editor are then automatically generated
from this specification. In its current state, the tool creates editors with the
following main features:



M. Nagl, A. Schürr, and M. Münch (Eds.): AGTIVE'99, LNCS 1779, pp. 433-440, 2000.
(c) Springer-Verlag Berlin Heidelberg 2000







Mark Minas and Oliver Koth 



- DiaGen currently generates only free-hand editors.¹ A hypergraph parser
checks the correctness of diagrams, creates their syntactic structure, and
controls the semantic analysis.

- Generated editors provide an automatic layout based on constraint solving.
Currently, DiaGen uses either QOCA [6] or Parcon [4] as constraint solver.

- DiaGen's generator as well as generated editors run under Java 2 and are
thus portable to many computer platforms.

- The current DiaGen version generates a stand-alone editor. Manual modifications
of this editor are necessary if such an editor has to be part of another
system. Work in progress aims at solving this problem by generating the editor
as a software component which conforms to the JavaBeans standard [5]
and can then be used with RAD tools.

The concept of the generated editors and how they perform syntactic and
semantic analysis are described in [7] within this volume. This paper presents
DiaGen as a tool which is based on these concepts. Rooted trees as shown in
Fig. 1a are used as a running example. We have chosen this very simple example
because its complete specification fits on a single page (see Fig. 2). But we do
not want to give the impression that DiaGen can only be applied to trivial
examples. Real-world examples, which have been generated with DiaGen, are
ladder diagrams (the running example of [7]) and sequential function charts,
which are used to program programmable logic controllers.

The next section briefly outlines how free-hand editors of DiaGen perform 
syntactic and semantic analysis of diagrams. Section 3 then shows the process of 
creating a diagram editor with DiaGen and explains the specification for rooted 
trees. Behavior and features of generated editors are discussed in Sec. 4. Section 5 
concludes the paper. 

2 Diagram Analysis 

Free-hand editors, which are part of a larger system, have to translate a diagram 
into some semantic representation. This section outlines the translation process 
as it is used by diagram editors which are generated with DiaGen and as it 
is described in more detail in [7]. It consists of four steps which are performed 
after each modification of the diagram: scanning, reducing, parsing, and semantic 
analysis. A DiaGen specification describes these steps for the specified diagram 
language. Section 3 discusses the specification for rooted trees. 

1. Scanning step 

Diagram components (e.g., circles and arrows in trees) are represented by
hyperedges. Nodes represent the diagram components' attachment areas, i.e.,
the parts of the components that are allowed to connect to other components
(e.g., the end points of an arrow). These nodes and hyperedges make up an
unconnected hypergraph. The scanner connects nodes by additional edges if
the corresponding attachment areas are related in a specified way, which is
described in the specification. The result of this scanning step is the spatial
relationship hypergraph (SRHG) of the diagram. Figure 1b shows the SRHG
of the rooted tree in Fig. 1a. Nodes are represented by black dots, hyperedges
by ovals that are connected to their nodes. The inside-relation between an
arrow's end point and a circle's area holds if the arrow starts or ends
within the circle.

¹ However, syntax-directed editing as an additional mode has been designed and is
currently being implemented.

Fig. 1. Representations of a rooted tree in an editor which has been generated with
DiaGen: diagram (a), spatial relationship hypergraph (b), and reduced hypergraph
model (c).

2. Reducing step

SRHGs tend to be quite large even for small diagrams (see Fig. 1b). In order
to allow for efficient parsing, the SRHG is reduced first. This step is similar
to the lexical analysis done by compilers. The result of this reducing step is
the actual hypergraph model (HGM) of the diagram. The reducer is specified
by some transformations that identify sub-hypergraphs of the SRHG and
build the HGM. Figure 1c shows the HGM of the rooted tree in Fig. 1a.

3. Parsing step

The syntax of the hypergraph models of the diagram language (and thus
the main part of the diagram language's syntax) is defined by a hypergraph
grammar. DiaGen supports context-free hypergraph grammars with embeddings,
i.e., productions are either context-free ones with a single nonterminal
hyperedge on the production's left-hand side (LHS), or they are embeddings
where the production's right-hand side (RHS) is the same as its LHS, but
with an additional hyperedge. The second type of productions is not necessary
for trees (see Sec. 3).

4. Semantic analysis step

The diagram's syntax structure, which has been identified by the parser,
is used to create the diagram's semantic representation. The hypergraph
grammar which is used for parsing is actually an attributed one. Similar to
textual grammars, semantics are represented as attribute values of terminal
and nonterminal symbols (hyperedges); semantic analysis is specified by
attribute evaluation rules, which are assigned to grammar productions.
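The first two of these steps can be sketched for rooted trees. The following
Python fragment is not generated DiaGen code; it is a simplified illustration
under the assumption that circles are given by center and radius and arrows by
their two end points. Hyperedges are modeled as pairs of a type name and a tuple
of attached nodes, a circle has one attachment area named after the circle, and
the function names scan and reduce_srhg are made up. If an arrow end lies in
several circles, the sketch simply keeps the last one found.

```python
import math


def scan(circles, arrows):
    """Scanning step: build the SRHG as a list of (type, nodes) hyperedges."""
    srhg = [("circle", (name,)) for name in circles]
    for name, (x1, y1, x2, y2) in arrows.items():
        ends = {f"{name}.start": (x1, y1), f"{name}.end": (x2, y2)}
        srhg.append(("arrow", tuple(ends)))
        # Add an inside relation for every arrow end that lies within a circle.
        for end, (px, py) in ends.items():
            for cname, (cx, cy, radius) in circles.items():
                if math.hypot(px - cx, py - cy) <= radius:
                    srhg.append(("inside", (end, cname)))
    return srhg


def reduce_srhg(srhg):
    """Reducing step: circles become node edges, connected arrows become child edges."""
    hgm = [("node", nodes) for kind, nodes in srhg if kind == "circle"]
    inside = {nodes[0]: nodes[1] for kind, nodes in srhg if kind == "inside"}
    for kind, nodes in srhg:
        if kind == "arrow":
            start, end = nodes
            if start in inside and end in inside:
                hgm.append(("child", (inside[start], inside[end])))
    return hgm
```

An arrow whose end point lies outside every circle produces no child edge, so
the resulting hypergraph model simply contains two unrelated node edges, which
a parser would later reject as an incorrect tree.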



Layout is specified in a very similar way: the diagram's syntactic structure
is used to set up a constraint system on the diagram components' parameters.
Diagrams which have been recognized as correct therefore try to preserve their
syntax and semantics even when modified by the user.

3 Specifying Diagram Editors

The process of creating a graphical editor for a specific diagram language with 
DiaGen consists of three steps: 

1. Specify the diagram language 

The diagram language is defined in the DiaGen specification language. The 
specification has to contain complete descriptions of the reducer, the parser, 
and the layout constraints (however, constraints are optional). The different 
diagram components, the scanner, and the semantic analysis are specified 
partially. In the following, this is demonstrated by means of a specification 
of rooted trees. 

The specification language is currently a textual language. As soon as this
language has stabilized, it is planned to build a diagrammatic one on top.
DiaGen can then be used to generate a visual specification tool based on
that language.

2. Run the DiaGen generator 

The DiaGen generator reads the specification and creates some Java classes, 
which are ready-to-use and which implement the reducer, the parser, the 
core for semantic analysis, and the optional constraint solver for automatic 
layout. Furthermore, some Java class skeletons are generated which have to 
be fleshed out by the user. The skeletons ensure that these classes conform 
to the DiaGen API and runtime system. 

3. Flesh out class skeletons and add additional Java classes 

These classes implement the different diagram components of the diagram 
language (the user, e.g., has to add code for the visual appearance of the com- 
ponents), the detection of relationships between such components, and those 
parts of the semantic analysis which create user-defined data structures. The 
DiaGen API provides easily customizable standard classes to simplify this 
process. 

The rest of this section describes the specification of rooted trees which is 
shown in Fig. 2. Numbers are line numbers of this specification. 

A specification starts with a declaration part (1-11) and then contains a 
specification of the reducer rules (13-28) and grammar productions (30-54). 
The specification declares the Java package (1) which will contain the gener- 
ated classes and class skeletons. Declarations of diagram components (3-4) and 
relationships between components (5) specify the corresponding hyperedge types 
along with the number of tentacles in brackets. The specification also contains 
the names of generated class skeletons for those components in curly braces. Ad- 
ditionally, the types of attachment areas for each component have to be specified. 
An arrow, e.g., has two attachment areas of type ArrowEnd and is implemented 
in a class Arrow. The generated class will contain this information. Code that
actually paints an arrow on the screen and that allows user interaction with an
arrow (modifying its size etc.) has to be added by the user later on. Relations
have to be specified and coded similarly. This can typically be achieved by
applying some straightforward coding patterns and requires about 30 hand-written
lines of code for a simple component (e.g., a circle or an arrow) and less than 5
for a relation.

     1  package editor.tree;
     2
     3  component    circle[1]   { Circle[CircleArea] },
     4               arrow[2]    { Arrow[ArrowEnd,ArrowEnd] };
     5  relation     inside[2]   { ArrowInside[ArrowEnd,CircleArea] };
     6  terminal     node[1]     { Variable x, y, radius; },
     7               child[2];
     8  nonterminal  Tree[1]     { Semantics.Node root; Variable x, y, l, r; },
     9               Subtrees[2] { java.util.List sub; Variable y, l, r, lRoot, rRoot; };
    10
    11  constraintmanager diagen.editor.param.QocaLayoutConstraintMgr;
    12
    13  reducer {
    14    circle(a) ==> node(a) {
    15      rhs[0].x = lhs[0].param(0); rhs[0].y = lhs[0].param(1);
    16      rhs[0].radius = lhs[0].param(2);
    17      radius = lhs[0].param(2);
    18      constraints: radius <= 30;
    19    };
    20
    21    arrow(b,c) inside(b,a) inside(c,d) circle(a) circle(d) ==> child(a,d) {
    22      ax1 = lhs[0].param(0); ay1 = lhs[0].param(1);
    23      ax2 = lhs[0].param(2); ay2 = lhs[0].param(3);
    24      c1x = lhs[3].param(0); c1y = lhs[3].param(1);
    25      c2x = lhs[4].param(0); c2y = lhs[4].param(1);
    26      constraints: ax1 == c1x; ay1 == c1y; ax2 == c2x; ay2 == c2y;
    27    };
    28  }
    29
    30  grammar { start Tree;
    31    Tree(a) ::= node(a) {
    32      $$.root = createNode();
    33      constraints: $$.x == $0.x; $$.y == $0.y; $$.l == $0.x; $$.r == $0.x;
    34    };
    35
    36    Tree(a) ::= node(a) Subtrees(a,b) {
    37      $$.root = tree($1.sub);
    38      constraints: $$.x == $0.x; $$.y == $0.y; $$.l == $1.l; $$.r == $1.r;
    39                   $0.y <= $1.y-50; $0.x >= $1.lRoot; $0.x <= $1.rRoot;
    40    };
    41
    42    Subtrees(a,b) ::= child(a,b) Tree(b) {
    43      $$.sub = subtrees($1.root);
    44      constraints: $$.l == $1.l; $$.r == $1.r; $$.y == $1.y;
    45                   $$.lRoot == $1.x; $$.rRoot == $1.x;
    46    };
    47
    48    Subtrees(a,b) ::= [ [ child(a,b) Tree(b) ] Subtrees(a,c) ] !
    49      if b.x < c.x || b.x == c.x && b.y < c.y {
    50      $$.sub = subtrees($1.root,$2.sub);
    51      constraints: $$.l == $1.l; $$.r == $2.r; $$.y == $1.y; $$.y == $2.y;
    52                   $1.r <= $2.l-50; $$.lRoot == $1.x; $$.rRoot == $2.rRoot;
    53    };
    54  }

Fig. 2. DiaGen specification for rooted trees

Declarations of terminal (6-7) and nonterminal hyperedge types (8-9) also
describe the attributes that are assigned to instances of those hyperedges.
Attribute type Variable represents a constraint variable which is used by the
constraint solver for automatic layout. Semantics.Node is a user-defined data
structure which is used by semantic analysis. Finally, the constraint solver
engine, which will be used by the generated editor, has to be specified (11). DiaGen
currently supports QOCA [6] and Parcon [4].

The reducer specification describes the reducing rules. Each rule specifies a
pattern of the SRHG and the corresponding pattern of the hypergraph model.
The LHS of each rule thus consists of component and relationship hyperedges
while the RHS consists of terminal hyperedges only. Each hyperedge is textually
represented by its type and the visited nodes in parentheses. Lines 14-19
specify that circles in the SRHG map directly to node hyperedges in the HGM.
Edges from parent to child nodes are more complicated; they consist of an arrow
which starts and ends in corresponding circles (21-27). Furthermore, rules
specify attribute values of terminal edges in terms of the diagram components'
parameters (15-16) and constraints on the components' parameters, which must
be satisfied for those patterns (17-18, 22-26). The constraints in this example
define that a circle has a maximum radius of 30 units² and that arrows have to
start and end at circle centers.

The grammar of the hypergraph model is specified by the type of the starting
hyperedge (30) and by grammar productions. For trees, context-free productions
are sufficient. Each production has a nonterminal hyperedge on the LHS and an
arbitrary hypergraph of terminal and nonterminal hyperedges on its RHS. The
parser is similar to the Cocke-Younger-Kasami parser for textual languages and
thus actually requires a grammar in Chomsky normal form (CNF) [1], i.e., each
production's RHS has to consist either of a single terminal or of two nonterminal
hyperedges. DiaGen's generator transforms the specified grammar into CNF.
Since this transformation is crucial for parsing efficiency, hints can be given to
the generator. E.g., the fourth production (48-53) uses brackets to give a hint
how to split up the original RHS. The ! at the end of line 48 is a further
efficiency-improving hint to the generator.³ Line 49 is an application condition for
this production. The condition on the nodes' positions defines an ordering on the
tree nodes to make the grammar unambiguous.
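For readers unfamiliar with the textual algorithm referred to here, the following
Python sketch shows the classical Cocke-Younger-Kasami recognizer for strings and
a grammar in Chomsky normal form. It is the well-known string version, not
DiaGen's hypergraph parser; the example grammar for the language a^n b^n and the
function name cyk are chosen for this illustration only.

```python
def cyk(tokens, terminal_rules, binary_rules, start):
    """Return True if the token sequence can be derived from the start symbol.

    terminal_rules maps a terminal to the set of nonterminals A with A -> terminal.
    binary_rules maps a pair (B, C) to the set of nonterminals A with A -> B C.
    """
    n = len(tokens)
    if n == 0:
        return False
    # table[i][k] holds the nonterminals that derive tokens[i : i + k + 1].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, token in enumerate(tokens):
        table[i][0] = set(terminal_rules.get(token, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                for b in table[i][split - 1]:
                    for c in table[i + split][length - split - 1]:
                        table[i][length - 1] |= binary_rules.get((b, c), set())
    return start in table[0][n - 1]


# CNF grammar for a^n b^n (n >= 1):  S -> A B | A X,  X -> S B,  A -> a,  B -> b
TERMINAL_RULES = {"a": {"A"}, "b": {"B"}}
BINARY_RULES = {("A", "B"): {"S"}, ("A", "X"): {"S"}, ("S", "B"): {"X"}}
```

The table is filled for increasingly long substrings, and every entry is built
from exactly two shorter entries. This is why the algorithm needs productions
with two nonterminals on the right-hand side, and why the way a longer
right-hand side is split during the transformation into CNF influences how much
work the parser has to do.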

² The default unit is a screen point when using a zoom factor of 1.

³ As a default behavior, the generated parser deliberately disregards the gluing
condition in order to be able to deal with (partially) incorrect diagrams. However,
this may result in a less efficient parser. Efficiency can then be improved by forcing
the parser to obey the gluing condition for selected productions which are marked by !.

Fig. 3. Two sample editors which have been generated with DiaGen: A tree editor and
an editor for the visual λ-calculus VEX [2].



Grammar productions optionally carry evaluation rules and constraints. At- 
tribute access has been inspired by YacC: the LHS hyperedge is referred to by 
$$, the RHS hyperedges by $0, $1, etc. E.g., $$.root (line 32) means attribute 
root of the LHS. Evaluation rules are always value assignments where values 
are computed by user code. It is this user code which is the part of the semantic 
analysis which is not automatically created by the generator. The constraint 
solver engine, which has been specified in line 11, restricts the set of possible 
constraints here: QOCA, which is written in Java and is thus as portable
as the rest of the editor, supports only linear constraints, whereas Parcon additionally
allows some non-linear constraints. Unfortunately, Parcon is only available
under Solaris and Linux^.

4 Editor Usage 

Figure 3 shows two screenshots of the current (preliminary) editor user interface 
(UI): It consists of a drawing canvas and a toolbar. Using either the toolbar or 
popup-menus, the user can place diagram components on the canvas. When a
component is selected, it creates "handles". These are UI objects whose position
is linked to the component's geometric attributes (which in turn define the visual 
appearance). The component can thus be moved or reshaped by dragging the 
handles. 

When a component is modified, the constraint solving mechanism tries to
preserve the syntactic and semantic structure of the diagram by adjusting the 
geometric attributes of other components. E.g., if a node is moved in the tree 
editor, its connections are adjusted and, if necessary, entire subtrees are moved to 
preserve the vertical node ordering. This “intelligent” mode can be switched off 
if the user wants to execute a modification that changes the diagram's syntactic 
structure, e.g., moving a subtree from one node to another. 

^ Due to the inefficiency of the current Java 2 implementation under Linux, Parcon 
is not really usable by DiaGen on Linux. 
The recognized diagram structure is indicated by coloring the components:
A substructure that could be parsed correctly (or an entire correct diagram) is
displayed in blue-green shades while incorrect diagram parts remain black. 

In addition, the editor currently provides the following features: 

- Selection of multiple components, cut & paste operations. 

- Saving and loading of diagrams. 

- Displaying and editing at arbitrary zoom factors. 

- Multiple editor windows for the same diagram. 

5 Conclusions 

The paper has given a brief outline of the current state of DiaGen, a
specification method and generator for diagram editors, and how diagram
editors are generated with this tool. The tool and sample editors (e.g., the
tree editor, which has been used in this paper) are available on the web
(http://www2.informatik.uni-erlangen.de/DiaGen).

DiaGen can only generate free-hand editors so far. Current work adds syntax- 
directed editing as an additional editing mode. DiaGen is furthermore going 
to generate diagram editors that are not stand-alone programs, but software 
components conformable with the JavaBeans standard. Hopefully, users will 
then be able to easily customize generated diagram editors and integrate them 
into larger systems. 

Abstract. PROGRES is the first strongly typed graph rewriting language.
Its programming environment supports the specification, rapid
prototyping and implementation of abstract data types with graph-like 
internal data structures. The main application area for this language is 
the development of complex internal data structures for software devel- 
opment tools and environments. In this article we present the PROGRES 
system, parts of its functionality and a rapid prototype generated from a
PROGRES specification.



1 Introduction 

Although the idea of using graph transformations for specification and very high 
level programming purposes is 30 years old, tools which support graph rewriting 
techniques have been developed rather recently. Many people believed that 
modelling with graphs and graph rewriting systems leads to inherently inefficient 
implementations due to the NP-completeness of many graph algorithms. This
situation changed gradually with the appearance of the first graph rewriting systems
or graph grammar implementations like GraphEd [3], PAGG [2], and GOOD
[6]. In this article we will present the PROgrammed Graph REwriting System
PROGRES [7], which is available as free software. PROGRES has been used in
many projects with industrial partners such as the Aachen-Münchener Insurance
Company, Alcatel, Bayer AG, Springer publishing, and Ericsson, for specifying tools
and data structures of integrated software engineering environments, describing 
process modelling, version control, and configuration management tools in CIM 
environments, and many more. 

An integrated environment has been developed for the PROGRES language. 
It comprises a hybrid textual/graphical syntax-directed editor, a variety of
integrated analysis and browsing tools, and an interpreter together with compiler
backends for C and Modula-2. The design and the realization of this
environment is based on our experiences with the construction of integrated software
development tools during the IPSEN project [5]. Attributed graphs are used 
as underlying data structures for modelling and implementing internal data 
structures. The language PROGRES supports the definition of the corresponding 
graph schemes and the description of graph transformation operations. 

Using the integrated C front-end compiler stand-alone application proto- 
types can be generated. They allow to hide the complex specification and the 

M. Nagl, A. Schürr, and M. Münch (Eds.): AGTIVE'99, LNCS 1779, pp. 441-448, 2000.

(c) Springer-Verlag Berlin Heidelberg 2000
development environment from the user and to provide them with a simple but
adaptable Tcl/Tk frontend. In this way the specification can be executed and
validated without knowledge about graph rewriting systems by "end users" of
the specified system. 

In the next sections we will present the PROGRES environment, beginning
with the syntax-directed editor for creating graph schemes and graph
manipulation operations. Section 3 outlines the manifold analysis facilities
PROGRES offers its users. In section 4 we show the functionality of the
integrated interpreter. Finally, in section 5 we present a rapid prototype which
has been generated from a PROGRES specification.



2 The PROGRES Editor 



The PROGRES environment consists of three main tools: an editing tool, an
analyzing tool, and an interpreter with integrated C compiler. In this section we
focus on the first of these tools, the syntax-directed editor.

A specification written with PROGRES always consists of two parts. The 
first part describes the graph scheme and the second part contains all graph 
manipulation operations. Fig. 1 shows the graphical editor for developing a graph
scheme. 




At the right side of this window the user can find the menu which shows the 
currently applicable editing commands. In this case we can create a new node 
type (NodeTypeDecl), a new node class (NodeClassDecl), or a new section
(Section). The menu entry Generic comprises functions like cut, copy & paste,
and so on. The workspace on the left-hand side shows already defined node 
classes and node types. Node classes can be seen as abstract classes like in
OO modelling languages which comprise common properties of certain node
types. Node classes are displayed as normal rectangles. Node types are concrete
(instantiable) classes in terms of OO modelling languages. Node types are shown
as rounded rectangles. In fig. 1 we have defined two node types, Resource and
Task. Furthermore, a number of relationships between node types and node
classes can be seen. Inheritance dependencies between node types and node
classes are denoted by arrows with hollow heads. Binary relations between node
classes or node types are expressed by arrows with an open head (e.g. For and
Free). This notation is similar to UML syntax. Node class or node type
definitions may also contain attribute definitions. These attributes can be of
any type (i.e. integer, string, boolean, node class, node type, or any self-defined
type imported from a C library). There are currently four kinds of attributes
available: 

— intrinsic attributes: Any value which obeys the type definition can be as- 
signed to this attribute. 

— derived attributes: The value of these attributes is not assigned directly but 
computed via an evaluation function. 

— meta attributes: The values of these attributes are determined at design 
time and are considered constant during the execution (run-time) of a
specification. With the help of meta attributes it is possible to define generic 
classes (class templates). 

— constraint attributes: These attributes are syntactically similar to derived 
attributes but the values may only be of boolean type. Every constraint 
attribute at instances of nodes in the host graph has to evaluate to true at 
any time. 
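
The four attribute kinds can be mimicked in an ordinary object-oriented language. The sketch below is only an analogy and not PROGRES syntax; the class, the attribute names, and the two constraints are assumptions chosen for illustration.

```python
class Task:
    # meta attribute: fixed at design time and constant at run time (assumed name)
    TIME_UNIT = "days"

    def __init__(self, name, start, duration):
        # intrinsic attributes: any value of the right type may be assigned
        self.name = name
        self.start = start
        self.duration = duration
        self.check_constraints()

    @property
    def end(self):
        # derived attribute: never assigned, always computed by an evaluation function
        return self.start + self.duration

    def constraints(self):
        # constraint attributes: boolean valued, each must evaluate to true
        return {
            "non_negative_start": self.start >= 0,
            "positive_duration": self.duration > 0,
        }

    def check_constraints(self):
        violated = [key for key, ok in self.constraints().items() if not ok]
        if violated:
            raise ValueError(f"constraints violated for {self.name}: {violated}")
```

Changing an intrinsic attribute such as `start` changes the derived attribute `end` implicitly, which mirrors the way derived values in the graph scheme are determined without explicit assignment.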

In our example in fig. 1 we want to build a simple task net editor which is 
able to create and manipulate a Gantt chart. You can find several definitions
of intrinsic and derived attributes in the graph scheme. The node type Task 
defines five derived attributes which indicate e.g. the earliest possible starting 
time of a task, the latest possible ending time etc. These values are all computed 
by evaluation functions, so their value is determined implicitly. 

Having defined the graph scheme we can now specify the graph manipulation 
operations. E.g. we want to be able to construct such Gantt charts, manipulate
them and do some simple queries on this chart, such as finding the critical path^ 
of the project etc. For these purposes PROGRES offers mainly six different 
specification constructs: Productions, Paths, Restrictions, Tests, Transactions, 
and Queries. 
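
As a point of reference for the critical path query mentioned above, the following sketch computes earliest finishing times and the critical path with a plain forward pass over the task net. It does not use PROGRES constructs; the function name, the dictionary encoding, and the sample tasks are assumptions for illustration (the standard module graphlib requires Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def critical_path(durations, predecessors):
    """Return (project duration, list of tasks on one critical path).

    durations:    maps each task to its duration
    predecessors: maps a task to the tasks that must finish before it starts
    """
    graph = {task: tuple(predecessors.get(task, ())) for task in durations}
    earliest_finish, best_pred = {}, {}
    # static_order() yields every task after all of its predecessors
    for task in TopologicalSorter(graph).static_order():
        preds = graph[task]
        start = max((earliest_finish[p] for p in preds), default=0)
        earliest_finish[task] = start + durations[task]
        # remember the predecessor that determines the earliest start
        best_pred[task] = max(preds, key=earliest_finish.get, default=None)
    last = max(earliest_finish, key=earliest_finish.get)
    path = []
    while last is not None:
        path.append(last)
        last = best_pred[last]
    return earliest_finish[path[0]], path[::-1]


# Hypothetical task net
DURATIONS = {"spec": 3, "design": 2, "code": 4, "docs": 1, "test": 2}
PREDECESSORS = {
    "design": ["spec"],
    "code": ["design"],
    "docs": ["design"],
    "test": ["code", "docs"],
}
```

In the specification itself the same values are held by derived attributes of the node type Task, so they are recomputed whenever the task net changes.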

Productions are graphically notated rules that describe graph transforma- 
tions. Every production has a left-hand side (LHS) and a right-hand side (RHS). 
The graph pattern which can be specified on the LHS of the production is the 

^ The critical path is a path through the Gantt chart from the beginning to the end
of the project, which determines the duration of the project. If any task on this path 
is delayed, the whole project will be affected. 
subgraph which has to be matched to the host graph and will be replaced by
the graph modelled by the RHS of the production. Fig. 2 shows an example of
such a production. This production specifies a solution for the problem that one
resource is assigned to two overlapping tasks although there is a second similar
resource which could do the job and has not been assigned to a task yet. On
the LHS (the upper part) of the production we made use of variables submitted
as parameters to the production for narrowing down the number of pattern
matches (the notation -\ = Jl indicates that). Nodes which are denoted by e.g. 
'4 : Resource can be matched to any node of the type "Resource". Furthermore,
we have used paths (double arrows) and simple edges (single arrows). Crossed 
out edges or nodes mean that it is explicitly enforced that this edge or node 
may not exist in the subgraph match. If such a graph pattern can be found in 
the host graph fulfilling all specified conditions it will be replaced by the graph 
pattern of the RHS. The RHS refers only to nodes of the LHS: no new
node is created and no node is deleted from the graph. The
only modification is the deletion and the creation of an edge.
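
The effect of such a production can be imitated by hand on a small edge set. The sketch below is an assumption-laden paraphrase of the rule described above, not the PROGRES production of fig. 2: the data layout, the edge encoding as (resource, task) pairs, and all names are invented for illustration.

```python
def overlapping(t1, t2):
    # two tasks overlap if each starts before the other ends
    return t1["start"] < t2["end"] and t2["start"] < t1["end"]


def reassign_resource(tasks, resources, assigned, r1, t1):
    """Apply the rewrite once for the parameters r1 and t1; return True on a match.

    tasks:     maps a task name to {"start": ..., "end": ...}
    resources: maps a resource name to its kind
    assigned:  set of (resource, task) edges, modified in place
    """
    if (r1, t1) not in assigned:
        return False
    for t2 in tasks:
        if t2 == t1 or (r1, t2) not in assigned:
            continue
        if not overlapping(tasks[t1], tasks[t2]):
            continue
        for r2 in resources:
            same_kind = r2 != r1 and resources[r2] == resources[r1]
            # negative condition (crossed-out edge): r2 is assigned to no task
            is_free = all(r != r2 for r, _ in assigned)
            if same_kind and is_free:
                assigned.remove((r1, t2))  # delete one edge
                assigned.add((r2, t2))     # create one edge
                return True
    return False
```

As in the production, the node sets stay untouched and only one edge is deleted and one edge is created; if no match exists, the graph is left unchanged.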




Fig. 2. A production in PROGRES is a graphically notated graph transformation

Paths as shown in fig. 2 are derived relationships between nodes. Graphically, 
they are denoted as a double arrow. Paths can express complex navigation 
operations where nodes have to obey certain conditions etc. Paths are a very 
convenient way to express complicated relations between (sets of) nodes. 

Similar to paths are restrictions. Restrictions can be seen as a condition 
a node has to fulfill. These conditions can be of arbitrary complexity so that 
restrictions are as convenient in use as paths are. Graphically, they are denoted 
like paths with just one node involved, the target node, i.e. the node the head 
of the arrow is pointing to. 
Paths and restrictions are mostly used in the LHS of productions or tests. In
PROGRES a test defines a graph pattern just as the LHS of a production. 
This pattern will be matched to the host graph. A test is said to be executed 
successfully if such a match can be found. Tests are useful for finding certain 
subgraphs, (sets of) nodes in these subgraphs etc. 

The last two modelling means are not denoted graphically but textually:
transactions and queries. Transactions are complex graph manipulation operations
which provide control structures to the user such as deterministic and
non-deterministic concatenations of operations, loops, conditional statements
etc. Queries offer the same control structures as transactions but are limited 
to operations which do not modify the host graph. Therefore, queries can be 
regarded as complex graph tests. 

In the next section we will briefly explain the analyzing tool which supports 
the PROGRES user at design time of a specification by finding type mismatches, 
syntax errors etc. After that we will show how a specification can be executed. 



3 The PROGRES Analyzing Tool 

The PROGRES system supports the user who writes a specification by various 
means. One of the most powerful integrated tools is the analyzer. The analyzing 
tool is capable of checking the syntax, type compatibilities, variable scopes etc. 
More than 300 rules are currently implemented for checking a specification w.r.t. 
the language's static semantics. In order to be able to help the PROGRES user to
avoid, find, and eliminate errors we have implemented an incrementally working
analyzer, a number of browsing commands, and even a very first version of an 
automatically working error elimination tool which is able to solve conflicting 
attribute definitions w.r.t. inheritance. 




Fig. 3 shows an example of an error message for an error found by the
PROGRES analyzer. We have changed a path name of the production shown in
fig. 2. Of course, we have made a typo there. The specification does not contain
a definition of the path overlapping. The analyzing tool has found this mistake
and shows an error message in a separate window. For some errors (i.e. not such
obvious ones as we have produced here) PROGRES gives some suggestions how
the error could have occurred and how to eliminate it.

The analyzing tool can be switched on and off. If it is switched on, editing
large specifications can be very time consuming. However, before the user
starts the interpreter, the analyzer has to check that the specification is
error-free.

4 The Interpreter 

The built-in interpreter of the PROGRES system is able to execute a specification.
The execution of a specification is started by creating an interpreter
session. An interpreter session can be created, suspended, saved, and resumed 
again. A new view will pop up. This view will be used to inspect (implicit and 
explicit) variables of a specification during the execution of the specification. 

The creation of a new interpreter session includes several initialization tasks 
such as creating a new empty host graph and the compilation of the specification's
graph scheme into an internal format (for the underlying database system
GRAS [4]).

The current point of execution is marked with a gray background. Initially, 
this is the transaction Main. This transaction has to be present just as in every 
C style program. The PROGRES system offers several commands to execute a
specification either step by step, in larger increments, or running to a marked
breakpoint etc. It is also possible to step into operations so that PROGRES
supports its users in debugging specifications considerably.

PROGRES also has a built-in host graph browsing tool which can be launched
as soon as an interpreter session exists. In a new window the current host graph
will be shown. This tool offers a set of possibilities to adjust the appearance of
nodes, edges, and attributes. Also, their visibility can be set, even according to
some attribute values.

Furthermore, it is possible to translate the internal code the interpreter works
with to C code. The user can view the (pretty printed) C code for each part or the
whole of an operation separately when an interpreter session has been started.
However, the PROGRES system also offers to generate C code for the whole
specification into a user-defined directory. This C code can be translated by
any C compiler separately. With the help of the database system GRAS and a
prototyping framework based on the Tcl/Tk package the translated C code forms
a stand-alone application implementing the functionality of the specification
written in PROGRES. This prototyping framework is presented in the next
section. 

5 Generating Rapid Prototypes 

As mentioned in the previous section it is possible to generate C code of a 
PROGRES specification. This code can be used for building a stand-alone
application, a prototype of the specification. This prototype gives the user the
opportunity to execute PROGRES specifications together with an interactive
standard user interface. This leads to a faster execution of the specification. 

At the start of the application two windows are opened: the graph browser 
window with the main menu and a variable view window. However, the latter will 
be opened iconified. We will concentrate on the browser window in the following 
paragraphs. Fig. 4 shows an example of the generated prototype of our example. 
The browser already shows a graph representing a Gantt chart for some project.

Fig. 4. The application prototype showing a Gantt chart



This prototype offers its users not only the scaling and rotating of the graph 
but also different layout algorithms, the same customization facilities as the 
graph browser of the interpreter tool (i.e. visibility of nodes and edges, their 
representation, ...), and configurable menus. A simple text file which can be
edited by the user determines the menu structure of the application. However, 
this is limited to the PROGRES operations the application prototype offers, i.e. 
transactions, productions, tests and queries. These menus are also detachable so 
that they appear in a separate window as shown in fig. 4. 

The graph which has been built up by calling PROGRES operations success- 
fully can be saved as a text file. This text file may be used for backup purposes 
or sending to other people who use the same application prototype. The text file 
can be parsed into the application and further operations may be carried out on 
this graph. 

The prototype framework supports full undo/redo steps, as the PROGRES
system itself does. However, after having parsed a graph from a text
file the history of this graph is lost. 
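
The interplay of undo/redo and text export can be sketched as follows. This is not the prototype framework's implementation or file format: the class, the snapshot-based history, and the JSON encoding are assumptions that merely reproduce the described behaviour, namely that a graph read back from a text file starts with an empty history.

```python
import json


class GraphSession:
    """A graph as a set of (source, target) edges with snapshot-based undo/redo."""

    def __init__(self, edges=()):
        self.edges = set(edges)
        self._undo, self._redo = [], []

    def apply(self, add=(), remove=()):
        # remember the state before the operation; a new operation clears redo
        self._undo.append(frozenset(self.edges))
        self._redo.clear()
        self.edges = (self.edges - set(remove)) | set(add)

    def undo(self):
        if self._undo:
            self._redo.append(frozenset(self.edges))
            self.edges = set(self._undo.pop())

    def redo(self):
        if self._redo:
            self._undo.append(frozenset(self.edges))
            self.edges = set(self._redo.pop())

    def dump(self):
        # only the current graph is written, not the history
        return json.dumps(sorted(self.edges))

    @classmethod
    def load(cls, text):
        return cls(tuple(edge) for edge in json.loads(text))
```

Since `dump` serializes only the current edge set, calling `undo` on a freshly loaded session has no effect.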

We are aware that this prototype environment is far away from being useful 
to build industrial applications but it fulfills its task for evaluating specifications
and also for teaching purposes. Please refer to the chapter "UPGRADE -
A Framework for Graph-Based Applications" by D. Jäger which describes a
new implementation of a more flexible and more professional looking prototype 
framework. 

6 Conclusion 

PROGRES is a powerful tool for specifying graph grammars and graph rewriting 
systems. The user of the system is supported in many ways. First, the editing 
of a specification is made easy by a syntax-directed editor. With the help of
this editor a lot of syntactical errors can be avoided at design time already. 
The static semantics of a specification is checked by an incrementally working 
analyzing tool which checks the validity of more than 300 rules. An integrated 
interpreter allows the execution of a specification. The interpreter supports full 
undo/redo steps and intertwined editing and execution without restarting the
whole process of interpretation. Finally, it is possible to generate C code from 
a specification. With that, the user is able to build a stand-alone application 
prototype with the functionality of the specification. 

The PROGRES system has been used in many projects already. In some
projects industrial partners such as Aachen-Münchener Insurance, Alcatel,
Ericsson and many more have been involved. Descriptions of academic projects
can be found in this LNCS volume written by K. Cremer, A. Radermacher,
A. Schleicher, and D. Jäger and B. Westfechtel.

Current research on the PROGRES system and the PROGRES language
concerns the implementation of a new prototype framework and further
extensions of the PROGRES language. Currently we are integrating a module 
concept into the language and investigating concepts for extending PROGRES 
by the object-oriented programming paradigm. 

For further information refer to the PROGRES homepage: 

http://www-i3.informatik.rwth-aachen.de/research/progres

Abstract. The Fujaba environment provides means for the specification of
software systems in UML notation and offers the opportunity to simulate the
specified applications beforehand. To this end, Fujaba provides editors for
UML class diagrams for the static aspects of a software system and it provides
Story Diagrams for the specification of dynamic behaviour. Story Diagrams
combine UML activity diagrams for control flows and a UML collaboration
diagram like notation for graph rewrite rules. Statecharts can be
used for the specification of reactive objects. In Fujaba, each diagram has a 
precise formal semantics and this enables us to generate Java code from the 
specification. The generated Java code is executed in a Java Virtual Machine 
(JVM) and can be visualized by an integrated object browser. This paper 
shows in a tool demonstration how to use the Fujaba environment in order to 
simulate a specification of a shuttle based production control system. 

1 Introduction 

This paper presents in a tool demonstration how the Fujaba^ environment [FNT98] can
be used to specify e.g. production control systems. Fujaba provides (1.) UML class
diagrams [BRJ99] for the specification of static parts of the system, (2.) a combination of
UML activity and UML collaboration diagrams with an underlying graph grammar
semantics for the specification of dynamic parts of software systems, (3.) statecharts
[Har84] for the specification of reactive objects, and (4.) SDL [ITU96] block diagrams
for the specification of asynchronous messages. From such a specification, Fujaba is
able to generate 100% pure Java code. The integrated Dynamic Object Browsing
System (DOBS) allows a simulation of the generated code. Such a simulation enables us
to test the specification in order to validate its behaviour. 

2 Fujaba, an overview 

Fujaba has been under development since 1998. It is planned as a round-trip and reverse
engineering tool that provides editors, code generators and recognizers for all types of
UML diagrams. Figure 1 shows a screen shot of the Fujaba environment that contains
the class diagram of the production control system example. 




Figure 1 Fujaba class diagram screen shot 

The class diagram shows a simple example specification of a production control sys- 
tem. Class Factory serves as a root class for the example and models the whole factory. 
The production line consists of fields represented by class Field. The self-association 
next connects fields to tracks for shuttles. Shuttles moving on fields are modeled by 
class Shuttle. Shuttles are able to stand at a certain field (association at to class Field) 
and to carry some good (association carries to class Good). The goods will be produced 
at assembly lines in the factory. Classes can contain attributes, e.g. the string typed va- 
riable wantedGood of class Shuttle, or methods and signals. In a class diagram, methods 
are only method declarations and method bodies can be specified using Story Diagrams. 
Fujaba combines UML activity diagrams and graph rewrite rules for the specification 
of method bodies. The control flow of methods is specified via UML activity diagrams, 
where each activity can contain either Java source code or graph rewrite rules.
Figure 2 shows a screen dump of Fujaba with the body of the go method of class Shuttle. 
The control flow of the method starts at the filled circle on the top of the diagram and 
follows the transitions. The first activity contains a graph rewrite rule. If the execution 
of the graph rewrite rule is successful, the control flow follows the transitions guarded 
with success and the method reaches the end-symbol. In case of a non-successful exe- 
cution of the graph rewrite rule, the variable blocked is set to true and the method ends. 
Graph rewrite rules in Fujaba show the left- and right-hand sides of a rule in a single
picture and modifications are specified explicitly. In our experience, rules typically
look up relatively large object (graph) structures but contain only a few modifications,
so the single-picture notation is more compact and easier to read.




Figure 2 Story Diagram for method go of class Shuttle

The execution of the graph rewrite rule works as follows: First bind each variable in the 
rule to an object in the object structure and then execute the modifications. Fujaba 
traverses the object structure starting from already bound objects and tries to bind the 
unbound objects of the rule. In Figure 2, the this object is already bound (the variable 
has no type) and thus, Fujaba first tries to bind the field object f1 by traversing the at 
link between this and f1. The next variable to be bound is f2, which is also a field. The
cross-out of the at link between f2 and the variable other of type Shuttle specifies that 
there must not be another shuttle at field f2. Now, the modifications, specified in the 
rule, are executed, which means that the at link between this and f1 is deleted (specified 
by the two parallel, 'red' lines) and a new at link is created between this and f2 (specified
by a series of 'green' + symbols). Finally, the collaboration statement
"1: blocked := false" is executed.
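
The binding and modification steps just described can be paraphrased in a few lines of ordinary code. The sketch below is neither the Story Diagram nor the Java code that Fujaba generates; the attribute names for the two association ends and the helper `_move_to` are assumptions made for this illustration.

```python
class Field:
    def __init__(self, name=""):
        self.name = name
        self.next = None     # 'next' association to the following field
        self.shuttle = None  # reverse end of the 'at' association (assumed name)


class Shuttle:
    def __init__(self, field):
        self.blocked = False
        self.at = None
        self._move_to(field)

    def _move_to(self, field):
        # delete the old 'at' link (if any) and create the new one
        if self.at is not None:
            self.at.shuttle = None
        self.at = field
        field.shuttle = self

    def go(self):
        # binding phase: f1 via the 'at' link, f2 via the 'next' link
        f1 = self.at
        f2 = f1.next if f1 is not None else None
        # negative condition: there must be no shuttle at f2
        if f2 is None or f2.shuttle is not None:
            self.blocked = True   # rule failed
            return False
        # modification phase
        self._move_to(f2)
        self.blocked = False      # "1: blocked := false"
        return True
```

The return value plays the role of the success and failure transitions that leave the activity containing the rule.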

3 The dynamic behaviour of shuttles 

To specify the dynamic behaviour of reactive objects, Fujaba uses statecharts. State- 
charts are assigned to classes, e.g. Figure 3 shows the statechart of class Shuttle. The 
notation is taken from the original statecharts introduced by Harel [Har84] and com- 
bined with graph rewrite rules. 
The top-level states of a shuttle are waiting and active. After creation each shuttle is in
state waiting and, if an assign signal is received, the shuttle enters the complex state
active. The complex state active contains four different states, namely fetch, produce,
sleeping and deliver. Each state represents a step in the production process of the sample 
factory. Like in Harel statecharts, each state consists of an entry-, a do-, and an exit-ac- 
tion. Guards, signals, and actions connected to transitions are notated as 
"signal [guard] / action". The execution sequence of actions is similar to Harel state- 
charts. 




However, note that in our approach a statechart controls only a single reactive object, 
e.g. a single shuttle. Multiple reactive objects communicate with each other via asyn- 
chronous messages. This communication is modeled using SDL block diagrams. 

State fetch of class Shuttle models that a shuttle has to go to a field named "Entry" and 
get some piece of iron. This behaviour is specified in the inline do-action, which is a
graph rewrite rule. If the shuttle, specified by the this object, is currently not at a field 
named "Entry" the execution halts until a reached signal is received or a timer of ten 
seconds runs out. The latter is specified by the "after 10000 / go()" transition. The
transition-action go lets the shuttle move to the next field (cf. Figure 2) and the graph
rewrite rule is executed again. Once the shuttle is at an "Entry" field a new Good object
is created and put on the shuttle. After successful execution the shuttle sends itself a
reached signal, which is queued and after finishing the do-action the shuttle goes into
state produce. The produce state checks if the shuttle is at a field with an assembly line. 
If not, the shuttle moves to the next field after ten seconds as in state fetch. When the 
shuttle reaches a field with an assembly line, it sends itself a reached signal and a pro- 
duce signal to the assembly. After that, it goes into state sleeping and waits for a goOn 
signal. Once the shuttle receives a goOn signal from the assembly line, it switches to 
state deliver which models the delivery of the manufactured good. After delivery, the 
shuttle reaches state fetch again and the production process starts anew.
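
The signal-driven part of this behaviour can be summarized as a transition table with a signal queue. The sketch below omits the do-actions and the timer transitions, and it is not the code generated by Fujaba. The text does not name the trigger that leads from deliver back to fetch, so the signal "delivered" is a placeholder assumed here.

```python
from collections import deque

# (current state, signal) -> next state; fetch, produce, sleeping and deliver
# are the substates of the complex state active
TRANSITIONS = {
    ("waiting", "assign"): "fetch",
    ("fetch", "reached"): "produce",
    ("produce", "reached"): "sleeping",
    ("sleeping", "goOn"): "deliver",
    ("deliver", "delivered"): "fetch",  # signal name assumed, see above
}


class ShuttleStatechart:
    def __init__(self):
        self.state = "waiting"
        self.queue = deque()  # signals are queued, as for the reached signal

    def send(self, signal):
        self.queue.append(signal)

    def step(self):
        """Consume one queued signal; it is ignored if no transition matches."""
        if not self.queue:
            return False
        signal = self.queue.popleft()
        self.state = TRANSITIONS.get((self.state, signal), self.state)
        return True

    def run(self):
        while self.step():
            pass
        return self.state

    @property
    def active(self):
        return self.state != "waiting"
```

Each shuttle owns one such object, which matches the statement that a statechart controls only a single reactive object.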

4 Testing the specification 

From the specified class diagrams, Story-Diagrams, and statecharts, the Fujaba code
generator automatically produces 100% pure Java code. For example, classes are
mapped to Java classes with attributes and method declarations. Story-Diagrams define 
the method bodies and the event handling of each class, which is specified by a 
statechart. Statecharts are implemented using threads in order to be executed concurrently.
For more details of the code generation for class diagrams and Story-Diagrams
see [FNT98, FNTZ99]; the mapping to threads is explained in detail in [Koe99, KNNZ99].
The concurrency control concepts are taken from [Lea97].




Figure 4 Current factory configuration 

A sample factory configuration is shown in Figure 4. The screen shot is taken from the 
Dynamic Object Browsing System (DOBS) of Fujaba and shows the current object




454 Jörg Niere and Albert Zündorf



structure within the Java Virtual Machine (JVM). There is a factory in the upper left and
an assembly line in the upper right corner. The assembly line is standing at field f2 and
the production line is currently a circle of four fields f2 to f5, which are connected via
next/prev links. On the production line there are currently two shuttles, s6 on field f4
and s7 on field f5.



DOBS can display attributes of objects based on the Java runtime type information
and reflection mechanisms, for example the current attribute wantedGood has as value
the string "clock". To provide a more convenient representation, DOBS can display
different icons for objects. To do so, DOBS uses the Java reflection mechanisms to ask
an object, which icons shall be displayed. E.g. the icon of shuttle s7 shows only the 
shuttle itself, in contrast to shuttle s6, where a piece of iron is lying on the shuttle. This 
visualisation is more expressive than to show a good object with a carries link to the 
shuttle like it is modeled in the class diagram (see Figure 1). 
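
The idea of a browser that inspects arbitrary objects and asks them for their icon can be illustrated with the reflection facilities of any dynamic language. The sketch below is an analogy only: DOBS itself relies on Java reflection, and the function names, the method name `icon_name`, and the icon file names are assumptions.

```python
def describe(obj):
    """List the public attributes of an arbitrary object with type and value."""
    return {
        name: (type(value).__name__, value)
        for name, value in sorted(vars(obj).items())
        if not name.startswith("_")
    }


def icon_for(obj, default="generic.gif"):
    # ask the object which icon shall be displayed, if it offers such a method
    chooser = getattr(obj, "icon_name", None)
    return chooser() if callable(chooser) else default


class Shuttle:
    def __init__(self, wanted_good, carries=None):
        self.wantedGood = wanted_good
        self.carries = carries

    def icon_name(self):
        # a loaded shuttle gets a different icon than an empty one
        return "shuttle_loaded.gif" if self.carries else "shuttle_empty.gif"
```

The browser needs no compile-time knowledge of the class Shuttle; objects without an `icon_name` method simply fall back to the default icon.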

The current factory configuration is intended to work as follows. Shuttles go to the field
named "Entry" (field f4), pick up a piece of iron and then go to the assembly line. The
assembly line produces a specific good out of the iron, e.g. a key or a lock; after that, the
shuttle goes to a field named "Deliver", where the good is taken from the shuttle and stored
(the storage is not modeled here).



To test this sample factory, DOBS allows the user to invoke methods on individual objects.
For example, the popup menu in Figure 4 shows the user invoking the go method of shuttle s6.
To test the specification, the user can thus let the shuttles do their job step by step
through explicit method invocations.




s7 : Shuttle
    wantedGood: String = clock
    state: String = waiting
    blocked: Boolean = true

f5 : Field
    name: String =

(object label missing)
    wantedGood: String = clock
    state: String = waiting
    blocked: Boolean = false

f3 : Field
    name: String = Deliver

s8 : Shuttle
    wantedGood: String = clock
    state: String = waiting
    blocked: Boolean = false




Figure 5 Assembly line producing a good

Such a test, in which the user has to invoke individual methods or adjust the layout by hand,
is very labor-intensive, and the control flow relies on the user. Reactive objects, however,
work autonomously and can be simulated continuously rather than step by step. Therefore, the
user just sends assign events to the shuttles. This switches the shuttle threads into the
active state, and they start to execute the production process autonomously and concurrently.
Figure 5 shows the main content of the DOBS window (see Figure 4) with the sample
factory. The threads have just been started and the shuttles are moving around. Shuttle s6 is
currently at the assembly line, where a clock is being produced out of the piece of iron the
shuttle is carrying in Figure 4. The stop sign in the icon of shuttle s7 indicates that the
shuttle could not go on to the next field, because shuttle s6 is currently there, and so
shuttle s7 is blocked (see also method go of class Shuttle in Figure 2).
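
The blocking behaviour described here can be sketched in a few lines of Java. The sketch is
single-threaded and purely illustrative: the body of go is an assumption (Figure 2 is not
reproduced in this section), the helper class ProductionLine is hypothetical, and the code
generated by Fujaba runs every shuttle in its own thread.

```java
// Hypothetical sketch of the circular production line; not generated Fujaba code.
class Field {
    final String id;
    Field next;        // "next" link of the circular production line
    Shuttle occupant;  // null while the field is free

    Field(String id) { this.id = id; }
}

class Shuttle {
    final String id;
    Field at;
    boolean blocked;

    Shuttle(String id, Field start) {
        this.id = id;
        this.at = start;
        start.occupant = this;
    }

    // Moves to the next field if it is free; otherwise stays put and sets the blocked flag.
    boolean go() {
        Field target = at.next;
        if (target.occupant != null) {
            blocked = true;
            return false;
        }
        at.occupant = null;
        target.occupant = this;
        at = target;
        blocked = false;
        return true;
    }
}

class ProductionLine {
    // Builds the circle f2 -> f3 -> f4 -> f5 -> f2 and returns the fields in that order.
    static Field[] circle() {
        Field[] f = { new Field("f2"), new Field("f3"), new Field("f4"), new Field("f5") };
        for (int i = 0; i < f.length; i++) {
            f[i].next = f[(i + 1) % f.length];
        }
        return f;
    }
}
```

With s6 on f2 and s7 on f5, a call to s7.go() fails and sets blocked, because f2 is occupied;
once s6 has moved on, the same call succeeds. In a threaded version the check and the move
would additionally have to be synchronized.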

Now, there might be the need to reconfigure the production line, because the demands
have changed or the production line is not as productive as it could be. Such
reconfigurations can be done by the developer within the running simulation. For example,
the production line can be extended by new fields, and to raise productivity an additional
shuttle s8 might be installed. The system could be halted to facilitate reconfigurations,
but this is not necessary. Using a simulation, bugs in the specification can be detected,
e.g. whether shuttles might crash or block each other. Beyond bugs in the specification,
the simulation also reveals configuration problems. For example, if shuttles are mostly
blocked because the assembly line is too slow, the option of buying a second assembly line
can be analysed by a simulation, or other, better configurations can be discussed.

5 Conclusions and perspectives 

We have shown how the Fujaba environment can be used to specify production control
systems and to test the specifications through simulation. The main problem is that modern
production systems undergo frequent changes, and reconfiguration of hardware and software
takes much time because the software cannot be tested up front. We presented a possible
solution to these problems using the Fujaba environment. Fujaba supports editing a
specification for a production control system, generating Java code from it, and
simulating it using DOBS.
Future work is to improve the simulation features of DOBS. For example, we need a
script language for the configuration of DOBS, so that parts of the configuration can be
placed directly in the specification. Attributes shall be associated with drawing objects,
so that e.g. a lamp attribute is displayed as a lamp and not as text that says true or
false. Another current project is a topology editor. For example, a track-based production
system may be plugged together in an editor offering track icons. This topology will then be
translated into graph rewrite rules that create the initial object structures for the
simulation.

Abstract. L-studio/cpfg is a plant modeling software system designed
for Windows 95/98/NT platforms. Its key components are the L-system-based
plant simulator cpfg and the modeling environment called L-studio.
We overview version 1.0 of this system from the user's perspective.



1 Introduction

L-studio/cpfg is a software system for simulating plant development and visualizing
plant architecture. It has been developed at and is distributed by the
University of Calgary. Its current and prospective applications include:

— modeling- and simulation-assisted research in botany, ecology, and applied
plant sciences (horticulture, agriculture, and forestry),

— synthesis of plant images for artistic and entertainment purposes (e.g. computer
animation) and for computer-assisted landscape design, and

— experimentation with algorithms for generating patterns and fractals.

Version 1.0 of L-studio/cpfg consists of:

— the L-system-based simulation program cpfg (plant and fractal generator
with continuous parameters),

— the L-studio modeling environment that provides auxiliary modeling tools
and confers a Windows-style graphical user interface on the entire system,

— a library of programs for simulating environmental processes that affect plant 
development, and 

— a set of sample models. 

The L-studio environment is the only component of the package designed 
specifically for the Windows 95/98/NT platforms, and has a UNIX counterpart 
called the Virtual Laboratory [1, 4]. The remaining components have been de- 
signed to be platform-independent. Consequently, cpfg and its related software 
can be used on both Windows and UNIX (Silicon Graphics) machines. This 
software portability has been achieved through the use of a widely available 
programming language (C) and the standard OpenGL graphics library [8]. 

A sample screen depicting L-studio/cpfg in operation is shown in Figure 1.

[Screenshot: the cpfg windows on the left and the L-studio window, titled
"L-studio V. 1.0 - [lilac-congo]", on the right, with a text editor open on the
model's L-system file.]

Fig. 1. A snapshot of the L-studio/cpfg screen. A plant model is shown in the main
cpfg window. An auxiliary cpfg window underneath displays output and error messages.
The L-studio window to the right makes it possible to edit the model. In this example,
a text editor is open on the text file that specifies the main characteristics of the model
in the L-system-based modeling language of cpfg.



2 The cpfg modeling program 

The simulation program cpfg lies at the heart of L-studio. The design of cpfg 
has been guided by two key objectives: 

— flexibility, making it possible to model and simulate a wide range of struc- 
tures and developmental processes in plants, and 

- visual realism of the models. 

To meet the flexibility criterion, the models are specified by the users in the 
L-system-based cpfg modeling language [11] (Figure 1, right). The original for- 
malism of L-systems [3] has been extended with mechanisms needed to simulate 
the interaction of plants with their environment, such as responses to pruning 
and competition for light and water [7, 12]. 
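
To make the underlying formalism concrete, the following minimal Java sketch performs the
parallel rewriting of a plain context-free L-system. It is an illustration only and does not
use the cpfg modeling language, which is parametric and considerably richer; the class name
LSystem and its methods are assumptions introduced for this sketch.

```java
import java.util.Map;

// Hypothetical helper for a deterministic context-free L-system; not part of cpfg.
class LSystem {
    // One derivation step: all symbols are rewritten in parallel.
    // Symbols without a production are copied unchanged.
    static String step(String word, Map<Character, String> productions) {
        StringBuilder out = new StringBuilder();
        for (char c : word.toCharArray()) {
            out.append(productions.getOrDefault(c, String.valueOf(c)));
        }
        return out.toString();
    }

    // Applies the given number of derivation steps to the axiom.
    static String derive(String axiom, Map<Character, String> productions, int steps) {
        String word = axiom;
        for (int i = 0; i < steps; i++) {
            word = step(word, productions);
        }
        return word;
    }
}
```

With the productions A -> AB and B -> A and the axiom A, four derivation steps produce the
words AB, ABA, ABAAB and ABAABABA. A graphical interpretation would then read such a word
symbol by symbol as drawing commands.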

Based on its input, cpfg creates a three-dimensional internal representation
of the model and projects it on the screen (Figure 1, left). Model visualization
is based on the graphical interpretation of L-systems using turtle geometry [13],
extended with several standard modeling and rendering techniques developed
in computer graphics, such as parametric surfaces, generalized cylinders, and
texture mapping [2]. The output of cpfg has the form of static models (which
can be interactively rotated and zoomed in by the user) and computer-generated
animations that result from the visualization of consecutive stages of the
simulation (Figure 2). The generated images can be output in several raster formats
and as PostScript files. The 3D models can be exported to external rendering
and modeling programs using rayshade and SGI Inventor scene description file
formats [5]. Moreover, user-specified quantities can be output to a file, allowing
for further statistical analysis of the models [11].

(Figure 2: from [10].)

3 The L-studio environment

In addition to the L-system-based description of the essential aspects of the models,
cpfg requires information to control the viewing and animation processes, and to
characterize visual attributes of the models. The L-studio environment provides
a user interface for specifying this complementary information and transferring
it to cpfg as a set of files which, taken together with the L-system file, constitute
a model.



L-system | View | Animate | Colors | Surfaces | Contours | Functions | Text file

Fig. 3. The L-studio tabs



The graphical interface is organized according to the Microsoft MDI (Multiple 
Document Interface) standard [9]. The L-studio window is divided into sections 
identified by tabs (Figure 3). Every tab is associated with an editor. Some editors 
are complemented with galleries that make it possible to select a specific object 
or feature to be edited within a given class (for example, the shape of a petal 
within the class of organ shapes). A short description of the tabs and editors 
follows. 

L-system opens a text editor on the model's L-system file (Figure 1).

View opens a text editor on the file that specifies rendering attributes of the
model (for example, the shading mode and parameters of light sources).

Fig. 4. The L-studio surface editor. The surface control points can be manipulated
directly in the subwindow that displays the surface, or using numerical controls
underneath. The gallery at the bottom makes it possible to select the surface to be edited.



Animate opens a form-based editor of the file that controls the cpfg animation. 
For example, this file specifies the time interval between frames and the 
numbers of the first and last frame to be shown. 

Colors provides access to two graphical editors that define color aspects of the 
models generated by cpfg. The color map editor is used to directly define 
colors of model components generated by cpfg, which then operates in color 
map mode [2]. The material editor makes it possible to define shading pa- 
rameters employed by cpfg in more realistic rendering, using the OpenGL 
lighting model [8]. 

Surfaces opens a graphical editor of bicubic (Bezier) surfaces [2], which can be
incorporated into cpfg models to represent plant organs such as leaves and
petals (Figure 4).

Contours opens a graphical editor of planar curves (B-splines), considered by
cpfg as cross-sections of generalized cylinders [2]. Generalized cylinders with
closed cross-sections are a convenient representation of stems, and generalized
cylinders with open cross-sections provide an alternative to Bezier
surfaces for representing leaves and petals.

Functions opens a graphical editor of functions of a single variable. The func- 
tion editor is similar to the contour editor, except that the edited B-spline 
curves are constrained to assign exactly one y to every x value in the normal- 
ized function domain [0, 1]. Graphically-specified functions provide a flexible 
mechanism for defining and manipulating many aspects of the model, for 
example growth functions of model components. 

Text makes it possible to open the text editor on any other user-defined text 
file associated with the model. For example, such files may provide model 
description or specify the interface between the model of a plant and the 
model of its environment, as discussed in the next section. 



4 The environmental programs 

An important application area of architectural plant modeling is the study of in- 
teractions between plants and their environment. L-studio/cpfg makes it possible 
to simulate such interaction using concurrent communicating processes to rep- 
resent the plant and its environment (Figure 5). The plant model is expressed 
using the formalism of open L-systems [7], which includes a construct for ex- 
changing information with the environment. Cpfg sends the information from 
the plant model to an environmental program and receives the environmental 
response, thus creating a feedback loop of information exchange. 




Fig. 5. Software organization for modeling plants that interact with their environment.
Shaded areas indicate components of L-studio, clear areas indicate programs and data
that may be defined or modified by the user. Cpfg communicates with the model of
the environment by exchanging messages in a standardized format, supported by the
L-studio/cpfg communication library. The content of these messages is specified by the
user. From [7].

Several programs for simulating environmentally-mediated phenomena have
been developed for experimental purposes [6]. Version 1.0 of L-studio/cpfg
includes programs that simulate basic phenomena: collisions between plants and
plant organs, the effects of shading on the distribution of light from the sky
hemisphere, and the diffusive transport of water in the soil. A library of
communication functions and the C source code of sample programs facilitate the
creation of new environmental programs by the user. The specification of
environmental programs in C or C++ is admittedly less convenient than the definition
of plant models in the cpfg modeling language, but no special-purpose high-level
language for defining the multitude of possible environmental processes currently
exists.



5 Sample models

L-studio/cpfg 1.0 is distributed with a set of models that illustrate various fea- 
tures of cpfg, its application areas, and modeling styles. The models are grouped 
into the following classes: 

— fundamentals of plant modeling using L-systems,

— simulation of developmental processes controlled by lineage (context-free L- 
systems), 

— simulation of developmental processes controlled by signals or the allocation 
of resources (context-sensitive L-systems), 

— simulation of plants interacting with their environment (environmentally- 
sensitive and open L-systems), 

— individual-based simulation of plant ecosystems, 

— inverse modeling techniques (inference of local architectural features from 
the global characteristics of the models), 

— construction of descriptive plant models according to measurements of plant 
architecture, 

— advanced modeling and visualization techniques (the use of parametric sur- 
faces, generalized cylinders, and texture mapping), and 

— other graphical applications of L-systems (e.g. modeling of fractals, sea
shells, molecules, and reaction-diffusion patterns).

Sample models generated with L-studio/cpfg are shown in Figure 6. 

6 Conclusions 

A distinctive feature of cpfg is its L-system-based modeling language, which 
makes it possible for users to define their own models and experiment with 
them. This makes cpfg particularly suitable for simulation-based research in 
botany and ecology. L-studio provides a graphical interface and modeling tools 
that facilitate the use of cpfg on Windows platforms. At present, L-studio/cpfg 
is used at approximately 30 research institutions worldwide. 





Fig. 6. Sample models generated using L-studio/cpfg. Top row: a rose campion
generated by a simple context-free L-system, a gaillardia with petals represented using
textured Bezier surfaces, and a lily with leaves and petals modeled as generalized
cylinders. Middle row: an inverse model of a fern leaf, a model of trees competing for light,
and a model of a plant ecosystem. Bottom row: a fractal pattern, a reaction-diffusion
pattern, and a model of a tree-like molecule (dendrimer) as examples of other graphical
applications of L-systems.

Abstract. In the paper Support for Design Patterns through Graph Transformation
Tools in this volume, we have already outlined the global structure of a
tool that allows for the analysis and transformation of software architectures using
graph tests and rewrite rules, respectively. We demonstrate the use of this
approach in the specific domain of distributed systems. A specific instance of a
graph based tool is called DiTo. We present two sample sessions demonstrating
the use of DiTo. The first is the simple information system we have already seen
in the paper. The second is the Java part of DiTo itself.



1 Introduction 

Consider the task of developing a distributed application on top of a middleware like
CORBA [OMG98] or DCOM [Ses97]. If a monolithic variant of such an application
already exists, it is necessary to restructure this application in order to meet certain
prerequisites of the middleware, for example the indirection of remote object instantiation
via a factory.

The main goal is the creation of a distributed variant of an application program starting 
from its current monolithic form. This comprises two main tasks: (1) analysis whether 
the current application (given a specific distribution structure) violates distribution pre- 
requisites, and (2) transformation of an application program in a way that it no longer 
violates a prerequisite while preserving the original semantics. In order to support 
round-trip engineering, a suitable tool has to implement additional steps, namely the 

1. Analysis of the application’s source code 

2. Specification of the distribution structure 

3. Preparation of the application for the specified distribution scenario (comprising 
analysis and transformation, as stated above) 

4. Generation of the distributed application 

In the following, we will study the distribution tool DiTo by applying these steps to two
sample applications. This tool is generated by the PROGRES environment [SWZ99]
from a graph grammar specification.

2 The Entry/Collection Program

In this sample session, we illustrate the deployment of the simple information system
from [Rad00a] (the “entry/collection” program). We will run through all the steps up to
the generation of a distributed application.



2.1 Source Code Analysis 




Fig. 1. Analysis of the Structure - All Nodes and Relations 



The representation of an application’s architecture as a graph allows for the execution of
graph tests and rewrite rules. If an application program already exists, the architecture
must be obtained by an analysis of its source code. This analysis is started via the menu
entry AnalyzeSource (in the submenu (Un)Parse). The result of this analysis is a TCL
file that has to be imported via the ExecuteScript command. Fig. 1 shows the resulting
class diagram. It is visually overloaded with information. In order to reason about
the architecture, it is necessary to filter the information. There are different ways to do
this. The first is to mask nodes according to their type. Thus, it is possible, for example,
to hide interfaces and classes and show only the package structure.

In many cases, it is more interesting to select views that are based not on the type of a
node but on its attribute values. Our graph model contains an attribute visibility, which
controls whether a node is shown or not.

In the example, it is useful to hide the Java system library and the standard types
(integer, float, ...), i.e. the packages java and BasicTypes. The result of these filter
mechanisms is shown in Fig. 2. The package java and all elements it contains are hidden.
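
The attribute-based filter can be pictured with a small Java sketch. It is only an
illustration under stated assumptions: in DiTo the filter operates on the PROGRES graph,
whereas the classes Node and ViewFilter, their fields, and the flat list of nodes used here
are hypothetical simplifications.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical node of the architecture graph with a visibility attribute.
class Node {
    final String pkg;          // top-level package that contains the node
    final String name;         // simple name of the class or interface
    boolean visibility = true; // controls whether the node is shown

    Node(String pkg, String name) {
        this.pkg = pkg;
        this.name = name;
    }
}

class ViewFilter {
    // Clears the visibility attribute of every node contained in one of the given packages.
    static void hidePackages(List<Node> graph, Set<String> hiddenPackages) {
        for (Node n : graph) {
            if (hiddenPackages.contains(n.pkg)) {
                n.visibility = false;
            }
        }
    }

    // Returns the names of all nodes that are still visible, in graph order.
    static List<String> visibleNames(List<Node> graph) {
        List<String> names = new ArrayList<>();
        for (Node n : graph) {
            if (n.visibility) {
                names.add(n.name);
            }
        }
        return names;
    }
}
```

Hiding the packages java and BasicTypes in a graph that also contains nodes from gui and
database leaves only the application's own classes in the view.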



Elements (node types): package with contained classes (labeled with the package name);
class/interface (labeled with «stereotype» and name)

Relationships (edge types): inheritance; containment

Shorthands (dependency relationship): «creates»; «calls»

Fig. 3. Elements of the Modeling Language



The representation of nodes and edges denotes different kinds of modeling elements
and their relationships, as shown in Fig. 3. The representation largely conforms to the
UML [RO+99]; in addition, we use the possibility to define graphical shorthands for
dependencies resulting from invocations («calls») and instantiations («creates»).

Fig. 4. Creating Additional Interfaces

2.2 Specification of the Distribution Structure 

Let us now start to attach packages to partitions: package gui to the “client” partition and
package database to the “server” partition, as shown in Fig. 4. After the attachment, the
implemented prototype automatically performs an analysis that checks for violations of
distribution prerequisites. In this case, it marks the class MainWindow with the notice
“remote creation”, because this class instantiates the (remote) class Entry. It also marks
the classes Collection and Entry, because they are accessed directly instead of through an
interface.



2.3 Preparation of the Application 

In order to prepare the program for an automatic distribution, we have to eliminate the
violations of distribution prerequisites. This requires a change of the program’s structure,
i.e. its architecture and source code. A developer has to choose a suitable predefined
program transformation that performs this task. The example requires two transformations:
(1) the creation of explicit interfaces for the two database classes, and (2) the insertion
of a factory method into the collection. The former is invoked via the transformation
makeInterface, taking a reference to the class node as a parameter. The respective
PROGRES transaction creates a new interface description containing all public methods of
the original class. By naming convention, the interface takes over the unchanged class name,
and the original class is renamed by appending a trailing _impl. All references to
the class, except instantiations, are redirected to the interface. The second transformation
inserts a factory method, i.e. a method that instantiates an object of a specific type
and returns it to the caller.

interface Entry {
    public String getName();
    public int getCode();
    public void setCode(in int n);
    public float getBalance();
    public void setName(in String s);
    public void setBalance(in float f);
    public int getNumber();
    public void setNumber(in int n);
}

Fig. 5. Redirection of Invocations from Concrete Implementations to Interfaces
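
The outcome of the two transformations can be pictured with the following Java sketch. It is
an illustration, not output of DiTo: only a subset of the methods of Entry is shown, and the
factory method name createEntry, the method size, and the decision to store each created
entry in the collection are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// New interface that takes over the original class name.
interface Entry {
    String getName();
    void setName(String s);
    int getNumber();
    void setNumber(int n);
}

// Original class, renamed with the trailing _impl.
class Entry_impl implements Entry {
    private String name = "";
    private int number;

    public String getName() { return name; }
    public void setName(String s) { name = s; }
    public int getNumber() { return number; }
    public void setNumber(int n) { number = n; }
}

interface Collection {
    Entry createEntry(); // factory method (hypothetical name)
    int size();
}

class Collection_impl implements Collection {
    private final List<Entry> entries = new ArrayList<>();

    // The only place where the concrete class is instantiated.
    public Entry createEntry() {
        Entry e = new Entry_impl();
        entries.add(e);
        return e;
    }

    public int size() { return entries.size(); }
}
```

A client such as MainWindow now calls createEntry() on the collection and works with the
returned Entry through the interface only, so it no longer instantiates a remote class
directly.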



Besides the transformations on the architecture layer that are executed by a generated
PROGRES prototype, a Java program [Rad00b] transforms the source code. Method
bodies, e.g. in the case of a factory, are instantiated using a template mechanism.

Fig. 5 shows the result of these two transformations on the architecture level. There 
are new interfaces for the classes Entry and Collection. The screenshot also depicts the 
ability to view signatures in an IDL style (unparsed from the graph), in the example the 
signature of the interface Entry. 



2.4 Generation of the Application 



After preparation, the distributed application can be generated. The user can choose the 
entry generateApplication from the prototype’s menu. It invokes the Java based gen- 
erator. The generator creates a subdirectory for each partition containing a necessary 
subset of the application code (i.e. all interfaces and classes that are associated with the 
partition) and a new main routine. A few middleware specific modifications to the appli- 
cation code are necessary. This comprises for example the inheritance from a skeleton. 

The main advantage of a separate generation step is the possibility to change the
distribution structure or middleware without much effort.

3 Distribution Tool

The distribution tool DiTo consists of a specification written in PROGRES and additional
Java parts that perform source code analysis and transformation and the generation
of the distributed application. In the following, DiTo analyzes its own Java parts. In
this example, we concentrate on the analysis task and skip preparation and generation
issues.



3.1 Analysis of the Architecture 

The Java part of DiTo consists of four different packages. Fig. 6 depicts a view that 
shows the package structure, including interfaces and classes. Only the containment 
relation is shown, all others (e.g. inheritance and call relationships) are hidden. Due to 
the amount of classes, we also hide the package hierarchy of Java’s system library. 




The view in Fig. 6 is created by using the spring-embedder layout algorithm and some
manual adaptations. The five main packages dito.gui, dito.jjtree, dito.util, dito.trafos
and dito.template and their composition in the top-level package dito are directly visible.
The figure also shows that dito.gui contains its own collection of utility classes.
There is an example of an inner class, CfgFilter, which is contained in ReadAWListener.

Fig. 7. Package Structure (includes interfaces, hides classes)



Fig. 7 shows the same hierarchy with a different selection of nodes and edges. The Java 
package hierarchy is shown again, but this time all classes are hidden. The initial layout 
stems from the Sugiyama [STT81] algorithm (again with some manual adaptations). 

Besides the overview given by the package structure, it is possible to focus on smaller
details. Fig. 8 shows a small fraction of the classes and interfaces inside the packages
dito.jjtree and dito.trafo. The containment relation is suppressed in order to focus on the
inheritance hierarchy. The methods of the interfaces Analyze and UnParser are shown. There
is an additional class EmptyAnalyze, which contains empty definitions of the methods defined
in the interface Analyze. Classes that need the interface Analyze can extend EmptyAnalyze
and then only have to define the methods they need (and not all that are defined in Analyze).
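
The role of EmptyAnalyze corresponds to the familiar adapter idiom, sketched below in Java.
The method names of Analyze are hypothetical, since the actual methods appear only in Fig. 8,
and ClassCounter is an example class invented for this sketch.

```java
// Hypothetical method names; the real interface is shown only in Fig. 8.
interface Analyze {
    void visitClass(String name);
    void visitMethod(String name);
    void visitCall(String caller, String callee);
}

// Provides an empty definition for every method of the interface.
class EmptyAnalyze implements Analyze {
    public void visitClass(String name) { }
    public void visitMethod(String name) { }
    public void visitCall(String caller, String callee) { }
}

// An analysis that is interested in classes only overrides a single method.
class ClassCounter extends EmptyAnalyze {
    int classes;

    @Override
    public void visitClass(String name) {
        classes++;
    }
}
```

ClassCounter can be used wherever an Analyze is expected, and the calls it does not care about
fall through to the empty definitions.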



4 Conclusion 

We have presented a tool supporting the creation of distributed applications.
The developer plans the distribution structure on the architecture level. Violations of
distribution prerequisites are detected automatically, and suitable transformations
eliminate them.

The internal representation of a class diagram inside a tool resembles a graph. Instead
of defining the structure of such a diagram using “ordinary” programming language
constructs and manually coding analyses and transformations, it is ideal to specify the
structure of the diagram (and that of the underlying graph) as well as the transformations
in a specialized environment like PROGRES and to generate the tool we have presented
here. This approach has the advantage that the visual rewrite rules and analyses serve
as good documentation of the tool’s behavior. It is also relatively easy to adapt these
rules for different application areas. Of course, the use of a graph based prototype
generator does not solve all tasks of such a tool. We still have to code the fragments that
are responsible for parsing and transforming the application’s source code. Standard
compiling techniques are used to generate these parts of DiTo.

For more details about DiTo refer to [Rad00b] or the web site
http://ist.unibw-muenchen.de/Tools/dito/.

Abstract. This paper overviews the graph rewriting programming language,
Grrr. The serial graph rewriting strategy is detailed, and key elements of the
user interface are described. The system is illustrated by a simple example.



1 Introduction 

The basic elements of the Grrr system are described in this paper. It allows graph data
structures to be visualised and has a computationally complete declarative programming
method. This paper concentrates on detailing the core rewriting strategy; other
literature [4,5,6] describes the more complex features that have been added for ease of
programming or for changing the rewriting method or execution order. However, the core of
Grrr remains a serial, deterministic rewriting strategy with a top down matching
method.

Other graph rewriting systems use different variants of the graph rewriting method,
and visualise programs and graphs in alternative ways. Examples of such systems are
Good [3], Progres [7], MONSTR [1] and Δ-grammar programming [2].

In Grrr, graphs have labeled nodes and labeled directed edges. Simpler graphs, for example
graphs without labels or loop-free graphs, can be specified if required. There are several
different node types, which allows the data graph to be differentiated from information
derived from the graph during execution.

The prototype is too slow and the interface is too clumsy for industrial usage, but the
current system allows experimentation and proof of concept. There are several suggested
application areas: database programming, graph drawing, associational relations
and graph algorithm animation. These share a graph-based, possibly visual view
of data that needs complex calculations.

The next section contains a worked example, based on a very simple program. The 
final section details possible further work on the Grrr prototype. 



2 Example 

This section shows the execution method of Grrr by a simple example. 




Fig. 1. A transformation window containing the transformation ‘GetAge’. This transformation
calculates the age of people given their birth date. The transformation has been simplified
for clarity, with the calculation dealing only with the year, and not with months or days,
and the current year, ‘1999’, is hard coded into the program



Fig. 1 shows a transformation window containing the transformation ’GetAge’, 
which has two rewrites. The first rewrite tests for a person who has yet had their age 
calculated and calculates their age from the current year. The second rewrite, which is 
only called after the first fails to match, terminates recursion by deleting the initiating 
trigger node. 

Every rewrite has a left hand side (LHS) and a right hand side (RHS). In a rewrite,
the differences between the positive part of the LHS graph and the RHS graph indicate
which nodes and edges are to be added and deleted in the host graph. The first rewrite
does not remove any primitives, but it adds an edge 'aged' attached to a new trigger
node 'Minus', the constant node '1999' attached to 'Minus' by an edge 'arg1', and a copy of the
variable node 'D' attached to 'Minus' by 'arg2'. Here we use the convention that the
labels of variables are shown with capitalised first letters, and constants are shown
with lower case first letters. The exception is the rectangular trigger nodes, which are
always constant, no matter what their label.

The 'Age' node and connecting 'aged' edge in the first LHS are both negatives
(shown by thick lines), indicating that they must not be able to match for the LHS to
match. Hence, the LHS will only match a person 'X' and a birth date 'D' (written with a
superscript) if there has been no age calculated yet for the person. The superscript on 'D'
is required because there are two instances of 'D' in the RHS, hence the programmer must
specify which was the original, and which is the new copy. The copy, distinguished by its
superscript, is used in the calculation to get the age.

The 'Minus' trigger calls a built in transformation. It requires two argument nodes
attached by edges with labels 'arg1' and 'arg2'. The label of the node attached by
'arg2' is subtracted from the label of the node attached by 'arg1', and a
new node labeled with the result is created and attached to the 'aged' edge. The
trigger is deleted, as are the two argument nodes and edges.
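The effect of the built in 'Minus' transformation can be illustrated by a minimal Python sketch. This is not Grrr code: the dictionary and triple representation of the graph, the function name run_minus, the edge directions and the edge label 'born' are assumptions made only for illustration.

```python
def run_minus(nodes, edges, trigger):
    """Execute one 'Minus' trigger on a tiny labelled graph.

    nodes:   dict mapping integer node id -> label (a string)
    edges:   list of (source id, edge label, target id) triples
    trigger: id of the 'Minus' trigger node
    """
    # Find the two argument nodes attached to the trigger.
    args = {label: dst for (src, label, dst) in edges
            if src == trigger and label in ("arg1", "arg2")}
    result = int(nodes[args["arg1"]]) - int(nodes[args["arg2"]])

    # Create the result node under a fresh id (hypothetical id scheme).
    result_id = max(nodes) + 1
    nodes[result_id] = str(result)

    # The trigger and both argument nodes are deleted.
    doomed = {trigger, args["arg1"], args["arg2"]}
    new_edges = []
    for (src, label, dst) in edges:
        if dst == trigger and src not in doomed:
            # An edge that pointed at the trigger (here 'aged') is
            # re-attached to the new result node.
            new_edges.append((src, label, result_id))
        elif src in doomed or dst in doomed:
            continue  # argument edges disappear with their nodes
        else:
            new_edges.append((src, label, dst))
    for node in doomed:
        del nodes[node]
    return nodes, new_edges
```

Called on a person 'jim' with an 'aged' edge to a 'Minus' trigger whose arguments are '1999' and '1972', the sketch leaves 'jim' with an 'aged' edge to a new node labeled '27'.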



Fig. 2. The first RHS graph of 'GetAge' in a graph editing window. The numbers in the right
hand corner indicate the current coordinates of the cursor



The transformation window does not allow graphs to be edited. It only has facilities
to delete existing rewrites and create blank rewrites. To edit a LHS or RHS graph, a
graph editing window has to be brought up by double clicking on a graph in the
transformation window. A graph editing window with the first RHS of 'GetAge' is shown
in Fig. 2. Numerous graph editing windows may appear on the screen at any time. The
functionality these windows provide includes adding new nodes and adding new edges
between existing nodes (edges are added after two nodes have been selected). Labels
and node types can be changed. Groups of nodes and edges can be selected, and then
deleted, cut, copied and pasted. When deleting or pasting, dangling edges are deleted.
The syntax of LHS or RHS graphs is maintained by ensuring that invalid nodes or
edges cannot be added to graphs. For instance, negatives cannot be added to RHS
graphs, and more than one trigger cannot be added to LHS graphs.





Fig. 4. The host graph after step 1 



An example host graph window is shown in Fig. 3. This is the graph where rewriting
occurs. It contains editing options much like the window of Fig. 2. In addition, it
has options to initiate rewriting: 'Step' and 'Run'. Step performs the next rewriting step
only, whereas Run rewrites the graph until there are no trigger nodes remaining. A
rewriting step consists of a single trigger node in the host graph initiating a single
rewrite of the transformation with the same name as the trigger label. The rewrite
changes only one subgraph in the host graph. The first host graph has only one trigger
node, 'GetAge', so this is the one to be executed. The topmost LHS of the transformation
is the first to be tested in the host graph. In this case a subgraph in the host will
match, and so the rewrite is used.

There are in fact two possible matches: the node 'X' with 'fred' or 'jim', and the node
'D' with the respective years. Where there is a choice of subgraph to match, the decision
is made by an iterative sort of both the LHS graph and the host graph, and
matching the highest valued subgraph. In this case 'jim' is the one to match, as 'jim' is
ordered higher than 'fred' ('j' is higher in the alphabet than 'f'). The host graph is then
changed as defined by the rewrite. The host graph after rewriting is shown in Fig. 4.
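The combined effect of the negative 'aged' edge and the ordering of candidate matches can be sketched in Python as follows. This is a deliberate simplification: Grrr sorts the LHS graph and the host graph iteratively, whereas the sketch just compares label strings. The function name match_first_lhs, the dictionary representation and the birth year used for 'fred' are assumptions, not taken from the system.

```python
def match_first_lhs(people):
    """Pick the person matched by the first LHS of 'GetAge', or None.

    people: dict mapping a person's label to a dict holding 'born' and,
            once it has been calculated, 'aged'.
    """
    # The negative 'aged' edge excludes every person who already has an age.
    candidates = [name for name, facts in people.items()
                  if "aged" not in facts]
    # Among the remaining candidates, the highest ordered label wins
    # (plain string comparison stands in for Grrr's graph sort).
    return max(candidates) if candidates else None


people = {"fred": {"born": 1960},   # hypothetical birth year
          "jim": {"born": 1972}}
first = match_first_lhs(people)     # 'jim' is ordered higher than 'fred'
```

Once 'jim' has an age the same call selects 'fred', and once both have ages it returns None, which corresponds to the first LHS failing to match.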

This first rewriting step adds nodes to the host graph, including the 'Minus' trigger
node. The presence of this new trigger means there are now two triggers in the graph.
As 'Minus' is newest, it is executed first. This newest first trigger initiation strategy
means that higher level triggers can remain in the host graph whilst transformations
that they call are executed. This allows programs to be structured in a hierarchical
manner.
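The newest first strategy behaves like a stack of triggers. The following Python sketch shows the resulting order of execution for the example; the functions run and make_example and the list based bookkeeping are assumptions made for illustration and do not describe the Grrr implementation.

```python
def run(stack, transformations):
    """Execute triggers newest first until none remain.

    stack:           list of trigger labels, oldest first
    transformations: dict mapping a trigger label to a function that
                     receives the stack and performs one rewrite
    """
    trace = []
    while stack:
        label = stack[-1]              # the newest trigger is executed first
        trace.append(label)
        transformations[label](stack)  # a rewrite may add or delete triggers
    return trace


def make_example():
    pending = ["jim", "fred"]          # people still without an age

    def get_age(stack):
        if pending:
            pending.pop(0)
            stack.append("Minus")      # first rewrite: adds a 'Minus' trigger
        else:
            stack.remove("GetAge")     # second rewrite: deletes its own trigger

    def minus(stack):
        stack.pop()                    # the built in deletes its trigger

    return {"GetAge": get_age, "Minus": minus}


trace = run(["GetAge"], make_example())
```

The 'GetAge' trigger stays on the stack while each 'Minus' trigger it creates is executed, so the trace has five steps, alternating between 'GetAge' and 'Minus' and ending with the 'GetAge' step that deletes the trigger.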

'Minus' is built in and calculates the difference between 1999 and 1972, creating a
node with the result, 27, as its label, whilst deleting the nodes involved in the calculation.
There are many built in transformations, taking various arguments. Some are atomic,
in that they cannot be derived from other primitives in the system; others have been
added for efficiency reasons. The result of executing 'Minus' can be seen in Fig. 5.




Fig. 5. After step 2 



The only trigger in the host graph is now the original 'GetAge' trigger. It is executed
and so again causes the first LHS to be tested in the host graph. 'jim' will not now
match with 'X', as '27' attached to 'aged' matches with the negatives. This means 'fred'
is the only person that can match and so that part of the graph will be rewritten, assigning
an age to the node. The host graph after both that execution step and the age
calculation step is shown in Fig. 6.

Fig. 6. After step 4



Both people in the host graph now have ages. The first LHS of 'GetAge' will now
no longer match, because the negative edge and node will match the edges and nodes
attached to both 'jim' and 'fred'. Hence the second LHS will be tried. This will match,
as it looks only for a trigger node, and so the rewrite will occur. The rewrite simply
deletes the trigger node, terminating the program as there are no more trigger nodes in
the host graph, as shown in Fig. 7.




Fig. 7. The final host graph, after step 5

There are various ways of showing the execution in the host graph. The fastest
method is to execute the program by the Run button, and see the final result after
execution has finished. However, to aid debugging, it is possible to highlight nodes that
have been matched, and see the continuous execution occurring as it happens in the
window.



3 Further Work 

This paper has worked through a simple example of programming with graph rewrites. 
Using similar techniques it is possible to create complex programs to alter visual rep- 
resentations of graph data. However, there is much work that might be done to aug- 
ment Grrr, improve its efficiency and adapt it to new application areas. 

The user interface needs improvement. Unlike text editing, graph editing needs 
specific, application based tools, particularly for systems such as Grrr, which rely on 
editing restrictions for syntactic correctness. Graph editing in Grrr can be improved by 
faster node and edge creation, improved cutting and pasting and changing the treat- 
ment of dangling edges. Changes are also needed to the visualisation of rewriting, and 
the addition of a good incremental graph drawing algorithm would improve the ap- 
pearance of the host graph. 

The programming language has no concept of libraries, encapsulation or other 
software engineering tools. The concept and design of such features will require more 
effort, but such additions should increase the portability, usefulness and attractiveness 
of the language. 

Improving the efficiency of execution is always a goal of language designers. The 
current implementation could be much streamlined. Also, there are many possible 
optimisations of graph matching and graph rewriting that could be explored. Other 
optimisations that could be explored rely on knowledge about the application, and 
restrictions on graphs. 

Improving execution efficiency should allow the scale of the graphs that are rewritten
to be increased. As graphs get bigger the problems of visualising the graph
also increase, and problems of storing graphs have to be dealt with, as a graph database
is required.

Further exploration in applications is always possible, with graphs widespread in
computer science, particularly in areas such as networks, parallel computing and software
engineering. The modifications required to meet the needs of such areas are
potentially interesting areas of research.



Abstract. AGG is a general tool environment for algebraic graph transformation
which follows the interpretative approach. Its special power
comes from a very flexible attribution concept. AGG graphs are allowed 
to be attributed by any kind of Java objects. Graph transformations can 
be equipped with arbitrary computations on these Java objects described 
by a Java expression. The AGG environment consists of a graphical user 
interface comprising several visual editors, and an interpreter which is 
also usable without the graphical interface. 



1 Introduction 

Graphs play an important role in many areas of computer science and they
are especially helpful in analysis and design of software applications. Prominent
representatives of graphical notations are entity relationship diagrams, control
flows, message sequence charts, Petri nets, automata, state charts and any kind
of diagram used in object oriented modeling languages such as UML [UML99].

Graph transformation defines the rule-based manipulation of graphs. Since
graphs can be used for the description of very different aspects of software,
graph transformation can also fulfill very different tasks. Graphs can conveniently
be used to describe complex data and object structures. In this case, graph
transformation defines the dynamic evolution of these structures.

Graphs have the possibility to carry attributes. Graph transformation is then 
equipped with further computations on attributes. Since graph transformation 
can be applied on very different levels of abstraction, it can be unattributed, 
attributed by simple computations or by complex processes, depending on the 
abstraction level. AGG is not specialized to a certain kind of graph transforma- 
tion application. 

The graphical user interface provides a visual layout of AGG graphs similar to
UML object diagrams. Several editors are provided to support the visual editing
of graphs, rules and graph grammars. Additionally, there is a visual interpreter,
the graph transformation machine, running in several modes. If a layout other than
the standard one is preferred, it is possible to use just the underlying graph
transformation machine and to implement a new layout component or even
a new graphical interface for the intended application.

AGG has a formal foundation based on the single-pushout approach to graph
transformation [EHK+97]. Since the theoretical concepts are implemented as directly
as possible, without neglecting necessary efficiency considerations, AGG
offers clear concepts and a sound behavior concerning the graph transformation
part. Clearly, Java semantics cannot be covered by this formal foundation. Furthermore,
the formal foundation of AGG offers verification possibilities which
can be implemented on top of AGG directly.

2 AGG Concepts 

Graph transformation based applications are described by AGG graph grammars.
They consist of one graph, the start graph initializing the system, and a
set of rules describing the actions which can be performed. The start graph as
well as the graphs of the rules may be attributed by Java objects and expressions.
The objects can be instances of Java classes from libraries like the JDK as well as
of user-defined classes. These classes belong to the application as well. Moreover,
rules may be equipped with negative application conditions.

The way graph rules are applied directly realizes the single-pushout
approach to graph transformation as presented in [Löw93,EHK+97]. The formal
basis for graph grammars with negative application conditions was introduced
in [HHT96]. If the applicability of rules is restricted to the well-known gluing
condition, the single and the double pushout approaches [CMR+97] yield the
same results. (See also the comparison of both approaches in [EHK+97].)

The attribution of nodes and arcs by Java objects and expressions largely follows the
ideas of attributed graph grammars as stated in [LKW93] and developed further
in [TEKV99]. The main difference is that here, Java classes and
expressions are used instead of algebraic specifications and terms. The combination
of attributed graph transformation with negative application conditions has
been worked out comprehensively in [TEKV99]. The AGG features follow these
concepts very closely.

In the following, the main AGG concepts are presented in more detail. For the 
illustration we use a sample implementation of a graph-based HTML browser. 
This application first scans a given directory for HTML files and presents them 
as nodes in a file tree. Furthermore, it starts an HTML browser used to inspect 
the files. This example shows how AGG graph transformation can be used to 
access system resources like the file system and how to control computations on 
complex Java objects by graph transformation. 

3 The Tool Environment 

Figure 1 shows the graphical user interface of the AGG system. To the left, the
window with the current graph grammars is shown. It is possible to have more
than one graph grammar loaded. The grammars and their contents are visualized
by a tree where the current graph grammar, start graph or rule is highlighted.
The selected graph or rule is shown in the corresponding graphical editor on the
right. The upper editor is for rules, showing the left and the right-hand sides; the
lower is for graphs. The attribution of objects is done in a special attribute editor
that pops up when a graph object is selected for attribution.

Fig. 1. Screen dump of AGG



In Figure 1 the graph grammar for an HTML browser is shown which contains 
six rules. The first rule is depicted in the rule editor. The graph editor does not 
show the start graph but the host graph after running the graph grammar. The 
start graph looks exactly like the left-hand side of the depicted rule. The graphs 
and rules are explained in the following sections. 



4 Attributed Graphs 

AGG graphs are directed and their nodes and arcs may be typed and attributed.
A type can be composed from a string and the visual layout. It is possible that
the string is empty and the type is determined only by the layout, i.e. different
layouts mean different types. For nodes several shapes and colours are
supported, arcs may also be differently coloured. Moreover, different arc styles
are supported. There may be arbitrarily many arcs between two nodes. In the
graphical editors, the type name (if any) is shown inside the node shape and
next to an arc.

The attributes are specified by a type, a name and a value. Each graph node 
and arc may have several attributes. All graph objects (nodes and arcs) of one 
type also share their attribute declaration, i.e. the list of attribute types and 
names. Only the attribute values may be chosen individually. From a conceptual 
point of view, attribute declarations have to be considered as an integral part of 
the definition of a node or arc type. 
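The relation between a type, its shared attribute declaration and the individual attribute values can be sketched in Python as follows. AGG itself uses Java types for attributes; the classes NodeType and Node, the method set and the attribute names chosen for “Directory” are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NodeType:
    name: str
    # The declaration is shared by all nodes of this type:
    # a tuple of (attribute type, attribute name) pairs.
    declaration: tuple


@dataclass
class Node:
    node_type: NodeType
    # Only the values are chosen individually for each node.
    values: dict = field(default_factory=dict)

    def set(self, attr_name, value):
        for declared_type, declared_name in self.node_type.declaration:
            if declared_name == attr_name:
                if not isinstance(value, declared_type):
                    raise TypeError(f"{attr_name} expects {declared_type.__name__}")
                self.values[attr_name] = value
                return
        raise KeyError(f"{attr_name} is not declared for {self.node_type.name}")


# Hypothetical declaration for a directory node of the browser example.
DIRECTORY = NodeType("Directory",
                     ((str, "name"), (int, "entries"), (bool, "expanded")))

home = Node(DIRECTORY)
home.set("name", "/home")
home.set("entries", 0)
home.set("expanded", False)
```

Two nodes of type DIRECTORY accept exactly the same attribute names and types, but each keeps its own values; an attempt to set an undeclared attribute is rejected.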

The attributes may be typed by any valid Java type. This means that it is not 
only possible to annotate graph objects by simple types like strings or numbers, 
but that we can also utilize arbitrary custom classes to gain maximal flexibility 
in attribution. However, the actual power of this concept will be shown in the 
following section, when we move ahead from the sole graphical description of 
states to the dynamic aspects of modeling state transitions by graph rules. We 
will then be able to use arbitrary Java methods to manipulate object attributes 
as well as to interact with the user or with the underlying system environment, 
while still specifying the structural aspects of a transition in a graphical way. 

To visualize the directory structure to browse HTML files we use three dif- 
ferent node types (cf. Figure 1), “Directory”, “HtmlFile”, and “Browser”. A 
“Directory” node has three attributes: its name as absolute path, the number 
of entries which already have been recognized, and a boolean flag indicating 
whether the directory has already been fully expanded or not. “HtmlFile” nodes 
only have one attribute, the name of the file. A “Browser” node is attributed 
by the reference to the actual browser object. This is a complex object of a
user-defined Java class. In the example, two arc types are distinguished, one is
called “visited” and is only used between the browser and an HTML file. The 
other one is used to show the file structure and is not named. In this example 
there is no need to attribute the arcs additionally. 



5 Graph Rules 

Graph rules are used to describe graph transformation. They consist of a left-hand
and a right-hand side, L and R, and, moreover, a set of negative application
conditions. The graphs occurring in a rule are typed and attributed, as mentioned
above. But it depends on the rule side which kinds of attributes are allowed. Left-hand
sides may contain not only concrete Java objects but also variables,
which are used to abstract the operation from concrete attribute values.
Moreover, the right-hand sides may contain more complex Java expressions to
express computations on the attributes. If an attribute is used only to check
some value without changing it, it only has to occur on the left-hand side of a
rule. If an attribute value is changed independently of the value it had before, it
only has to occur on the right-hand side.

The left and the right-hand side of a rule are related by a partial graph
morphism r : L → R. Those graph parts related by this morphism are preserved by
the rule, all the other graph objects in the left-hand side are deleted, and all others
in the right-hand side are newly created. To indicate which objects are mapped
to one another in the graphs, we use numerical tags preceding an object’s type
name, separated by a colon.

The first rule of our example, called “initDir”, is depicted in the rule editor
in Figure 1. It does not change the graph structure but only some attributes.
If the name of the directory is unset, it can be set by the user through a new
dialog created for this input. The name is then passed to a new “Entry” object. Since
this directory has not been expanded yet, its number of entries is 0 and its flag
is set to false.

Fig. 2. Rule “showDir”



Rule “showDir” makes a further subdirectory visible. If a directory still is 
not fully expanded, this rule is applicable. It creates a new “Directory” node in 
the file tree, increases the number of entries in the parent directory, checks if 
this is now fully expanded, and sets the attributes of the newly shown directory. 
Consider especially the complex boolean Java expressions used to compute the 
values of the expanded attributes. After having shown the directories, the files 
have to be made visible in the file tree. This is done by rules “prepare2ndPass” 
and “showFile” in Fig. 1, not shown in detail. Thereafter, the application of rule 
“openBrowser” (also not shown in detail) starts an HTML browser. 

Moreover, a rule may contain a set of negative application conditions (NACs)
which are able to express that something must not exist for a rule to be applicable.
Basically, this is realized by introducing another graph N to the rule which
is to hold the negative conditions just like the left-hand side graph L contains the
positive ones; L is related to N by a morphism l. Within an NAC, you specify exactly
that fraction of a matching situation that you don’t want to happen. Gathering all the
negative application conditions in one graph means formulating a disjunction of
conditions: if even a small part of N - l(L) cannot be mapped, the whole condition is
satisfied. To express that several graph parts must not exist independently of each
other, several conditions are needed. A rule is applicable if all these conditions are
satisfied.

For an example of an NAC consider rule “viewPage” in Figure 3. The NAC
is shown on the left, the left-hand rule side in the middle, and the right-hand
side on the right. If a certain HTML file has not been shown by the browser,
this rule shows the file.

Fig. 3. Rule “viewPage”

In Fig. 1 the browser with a section of the AGG home
page is visible in the lower left corner. We keep track of the browsing by inserting
“visited” arcs in the graph. So if there is no “visited” arc from the browser to a
certain HTML file, it may be shown by the browser and such an arc is inserted.
Consider the newly computed attribute of the browser. The reference to the
browser should be the same afterwards, but a certain action, called “setPage”,
should be performed. With the address of the HTML file as input, this method
induces the browsing of this file. Since the reference should not be set to the
return value of the method, we extended the Java syntax slightly by a new
operator: the semicolon “;” between an object and its method call. If it is used,
the return value is ignored and the attribute value is again the object reference.
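The meaning of the semicolon operator can be paraphrased by a small Python helper. This only illustrates the semantics described above and is not AGG's Java syntax extension; the helper call_and_keep and the stand-in class Browser with its method set_page are hypothetical names.

```python
def call_and_keep(obj, method_name, *args):
    """Call a method for its side effect and yield the object itself."""
    getattr(obj, method_name)(*args)   # the return value is ignored
    return obj                         # the attribute keeps the same reference


class Browser:
    """Hypothetical stand-in for the user-defined browser class."""

    def __init__(self):
        self.page = None

    def set_page(self, address):
        self.page = address
        return True                    # some return value we do not want


browser = Browser()
attribute_value = call_and_keep(browser, "set_page", "index.html")
```

After the call, attribute_value is the very same browser object, now showing the requested page, rather than the value returned by set_page.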

Furthermore, rules may contain attribute conditions. They may be any
boolean Java expressions and are checked before a rule is applied, i.e. during
matching. Attribute conditions are allowed to include any variable declared for
the corresponding rule. Clearly, the evaluation of an attribute condition depends
on the variable instantiations. Furthermore, rules may have parameters, which
are useful e.g. to let the user determine matches and attributes of new graph
objects.



6 Rule Application 

Rule application is performed in two steps: First we have to check if a rule is
applicable to a certain host graph, i.e. we have to find a match m : L → G.
Afterwards the rule is applied at one of its matches.

A match is a total graph morphism, i.e. each graph object of L is embedded
into graph G. If a variable occurs several times in a rule’s left-hand side, it
always has to be matched with the same value. Note that in general, we may find
multiple matches of the rule’s left-hand side into the host graph, while on the
other hand, there may be no matches at all. In the latter case, the rule is not
applicable to the given host graph. A rule is applicable at a certain match if
all its NACs and further attribute conditions are satisfied. An NAC is satisfied
with respect to a given match m : L → G if we cannot find a total morphism
n : N → G such that any object of L being mapped by l and n is mapped to
the same object by m. We require n to map the graph part N - l(L) injectively,
because this seems to be more intuitive.
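The satisfaction of an NAC at a given match can be checked by brute force, as the following Python sketch shows. It is a simplification under stated assumptions: graphs are dictionaries of typed nodes plus a set of labelled arcs (so parallel arcs and attributes are ignored), only node mappings are searched, and the function name nac_satisfied is hypothetical. It is not how AGG implements matching.

```python
from itertools import permutations


def nac_satisfied(l, N, G, m):
    """Return True if no total morphism n : N -> G is compatible with m.

    l: dict, the morphism from L's nodes into N's nodes
    N: NAC graph, G: host graph, both of the form
       {"nodes": {id: type}, "edges": {(source, label, target), ...}}
    m: dict, the match from L's nodes into G's nodes
    """
    # n is already determined on l(L), because n(l(x)) must equal m(x).
    fixed = {l[x]: m[x] for x in l}
    free = [v for v in N["nodes"] if v not in fixed]
    # The part N - l(L) is mapped injectively: distinct images for free nodes.
    for images in permutations(G["nodes"], len(free)):
        n = dict(fixed)
        n.update(zip(free, images))
        types_ok = all(N["nodes"][v] == G["nodes"][n[v]] for v in N["nodes"])
        edges_ok = all((n[s], label, n[t]) in G["edges"]
                       for (s, label, t) in N["edges"])
        if types_ok and edges_ok:
            return False               # the forbidden situation exists
    return True


# Simplified reading of rule "viewPage": the NAC forbids a "visited" arc.
N = {"nodes": {1: "Browser", 2: "HtmlFile"}, "edges": {(1, "visited", 2)}}
G = {"nodes": {"b": "Browser", "f1": "HtmlFile", "f2": "HtmlFile"},
     "edges": {("b", "visited", "f1")}}
l = {1: 1, 2: 2}
already_seen = nac_satisfied(l, N, G, {1: "b", 2: "f1"})
not_yet_seen = nac_satisfied(l, N, G, {1: "b", 2: "f2"})
```

For the file that already carries a “visited” arc the NAC is violated, so the rule is not applicable at that match; for the other file the NAC is satisfied.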

The basic idea of what happens during a graph transformation is simple
and intuitive: the matching pattern found for the rule’s left-hand side is taken
out of the state graph and replaced by the rule’s right-hand side. But let us
examine the effect more closely on a per object basis. Since a match is a total
morphism, any object o of the rule’s left-hand side L has a proper image object
m(o) in the host graph G. Now if o also has an image r(o) in the rule’s right-hand
side, its corresponding object m(o) in the state graph is preserved during
the transformation, otherwise it is removed. Objects exclusively appearing in R
without an original object in L are newly created during the transformation.
Finally, the objects of the host graph which are not covered by the match are
not affected by the rule application at all; they form the so-called context which
is always preserved during transformations. There is one thing to watch out for
when a rule is deleting nodes. To get a graph again, dangling arcs are implicitly
removed by a transformation as well, even though they belong to the context
which is normally to be preserved.
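The per object description above translates into the following Python sketch of a single rule application. It rests on several assumptions: graphs are dictionaries of typed nodes plus a set of labelled arcs, the match is injective, attributes are ignored, and the function name apply_rule and the scheme for fresh node ids are hypothetical. It illustrates the described effect and is not the AGG implementation.

```python
from itertools import count

_fresh_ids = count()                   # hypothetical source of new node ids


def apply_rule(L, R, r, G, m):
    """Apply a rule with sides L and R at match m and return the new graph.

    L, R, G: graphs of the form {"nodes": {id: type}, "edges": {(s, label, t)}}
    r: dict, partial morphism from L's nodes to R's nodes
    m: dict, total match from L's nodes to G's nodes (assumed injective)
    """
    nodes = dict(G["nodes"])
    edges = set(G["edges"])

    # Arcs of L without a counterpart in R are deleted from the host graph.
    for (s, label, t) in L["edges"]:
        preserved = s in r and t in r and (r[s], label, r[t]) in R["edges"]
        if not preserved:
            edges.discard((m[s], label, m[t]))

    # Nodes of L without an image under r are deleted ...
    deleted = {m[v] for v in L["nodes"] if v not in r}
    for v in deleted:
        del nodes[v]
    # ... and dangling arcs are removed implicitly, even in the context.
    edges = {(s, label, t) for (s, label, t) in edges
             if s not in deleted and t not in deleted}

    # Nodes of R without an original in L are newly created.
    image = {r[v]: m[v] for v in r}
    for v, node_type in R["nodes"].items():
        if v not in image:
            image[v] = ("new", next(_fresh_ids))
            nodes[image[v]] = node_type

    # Arcs of R are present afterwards; preserved ones are simply kept.
    for (s, label, t) in R["edges"]:
        edges.add((image[s], label, image[t]))

    return {"nodes": nodes, "edges": edges}
```

For example, applying a rule that deletes an “HtmlFile” node to a host graph in which a directory points to that file also removes the arc from the directory, although the arc belongs to the context.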

Besides manipulating the nodes and arcs of a graph, a graph rule may also 
perform computations on the objects’ attributes. During rule application, ex- 
pressions are evaluated with respect to the variable instantiation induced by the 
actual match. But in AGG, we are not limited to applying simple arithmetic 
operations on attributes. In fact, we may call arbitrary Java methods in an at- 
tribute expression, as long as the overall type of the expression matches the type 
of the attribute whose value it represents. 

Graph transformation can be performed in two different modes using AGG.
The first mode to apply a rule is called Debug mode. Here, one selected rule will
be applied exactly once to the current host graph. The matching morphism may
be (partially) defined by the user. Defining the match completely “by hand”
may be tedious work. Therefore, AGG supports the automatic completion of
partial matches. If there are several choices for completion, one of them is chosen
arbitrarily. All possible completions can be computed and shown one after the
other in the graph editor. After having defined the match, the rule will be applied
to the host graph once. The result is shown in the graph editor, that is, the host
graph is now transformed according to the rule and the match. Thereafter, the
host graph can immediately be edited, e.g. to improve the layout of the new
graph.

The second mode to realize graph transformation is called Interpretation
mode. This is a more sophisticated mode, applying not only one rule at a time
but a whole sequence of rules. The order of the rules to be applied is defined by
their order in the grammar tree. Starting the interpretation, each rule is applied
as often as possible, until no more matches for this rule can be found. Then, the
next rule is chosen and applied if possible. The graph transformation stops when
all rules in the rule list have been applied in the given order as often as possible.
The graph transformation finally stops if either no rule is applicable any more
or the user has stopped the transformation process.
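The control flow of Interpretation mode can be sketched in Python as a loop over the ordered rule list. The sketch assumes a single pass through the list, which is one reading of the description above, and replaces the user's stop request by a step limit. The function interpret, the parameter max_steps and the toy rules operating on counters are hypothetical.

```python
def interpret(rules, graph, max_steps=1000):
    """Apply each rule, in list order, as often as possible.

    rules: ordered list of (name, find_match, apply_at) triples, where
           find_match(graph) returns a match or None and
           apply_at(graph, match) returns the transformed graph
    """
    applied = []
    for name, find_match, apply_at in rules:
        while len(applied) < max_steps:    # stands in for a user stop
            match = find_match(graph)
            if match is None:
                break                      # no more matches: next rule
            graph = apply_at(graph, match)
            applied.append(name)
    return graph, applied


# Toy rules: the "graph" is reduced to counters of objects not yet shown.
rules = [
    ("showDir",
     lambda g: "dir" if g["hidden_dirs"] > 0 else None,
     lambda g, match: {**g, "hidden_dirs": g["hidden_dirs"] - 1}),
    ("showFile",
     lambda g: "file" if g["hidden_files"] > 0 else None,
     lambda g, match: {**g, "hidden_files": g["hidden_files"] - 1}),
]
final, applied = interpret(rules, {"hidden_dirs": 2, "hidden_files": 1})
```

With two hidden directories and one hidden file, the first rule is applied twice before the second rule is applied once, after which no rule is applicable any more.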


7 Conclusion

This paper gives a rough overview of the graph transformation environment
AGG. It consists of visual editors for graphs, rules and graph grammars as
well as a visual interpreter for algebraic graph transformation. Applications of
AGG may vary widely because of its very flexible attribution concept
relying on Java objects and expressions. The development group of AGG at the
Technical University of Berlin will continue implementing concepts and results
concerning verification and structuring of graph transformation that have already
been worked out formally. AGG is available at: http://tfs.cs.tu-berlin.de/agg.



AGTIVE Workshop/Symposium Panel Discussion on

Industrial Relevance of Graph Transformation:
The Reality and Our Dreams



Andy Schürr

University of the Federal Armed Forces, Munich
Institute for Software Technology
85577 Neubiberg, Germany
Andy.Schuerr@unibw-muenchen.de

After three decades of graph transformation (GT) oriented research activities it is now
time to (1) collect success stories of product developments based on GT technology,
(2) identify promising application areas for future breakthroughs concerning the
application of GT technology, (3) discuss what still has to be done to improve the
applicability of GT-based tools and techniques in industry, and (4) develop “public relations
campaigns” which make GT technology as popular as fuzzy logic, Petri nets, attribute
grammars, ... It was the purpose of this panel to discuss these topics from different
points of view using the following format:

- Each member of the panel had about 5 minutes to present her/his opinion
concerning the industrial relevance of GT technology.

- Furthermore, each member had to choose a one sentence long “motto” which
summarizes her/his presented opinion.

- Afterwards all participants of the workshop were invited to start a discussion about
these topics triggered by the presentations of the panelists.

The members of the panel and their mottos were as follows:

- Dorothea Blostein (Queen’s University, Kingston):

Integrate graph transformation technology into undergraduate courses and course
books for computer science students.

- Adam Borkowski (University of Warsaw):

Suitability doesn’t mean acceptance.

- Hans-Jörg Kreowski (University of Bremen):

There are 1000 ways to overcome the lack of visibility and recognition, but all may
be dead ends: Let’s go!

- Stefano Levialdi (University of Rome):

The expectations were too high ... the usability of existing VLs is too low, a lot of
work to be done.

- Armon Rahgozar (Xerox, Webster):

80 - 20 (realizing 80% of the functionality of a system takes about 20% of the time).

- Andy Schürr (University of the German Armed Forces, Munich):

Ride the bandwagon, integrate GTs with the standard modeling language UML.

To summarize, the members of the panel (as well as the other participants of the
workshop) agreed that we, the graph transformation community, have to
shift our focus from developing more and more elaborate graph transformation
techniques (the missing 20%) to activities where we

- teach GT technology to members of different communities (and to undergraduate 
computer science students), 

- simplify the existing visual GT languages and tools instead of adding more and 
more features (in order to improve their usability), 

- apply GT technology successfully (in a thousand different ways) in real-world projects,
and

- integrate it into already accepted modeling or programming languages (such as 
UML) in order to increase the probability of their acceptance. 

The AGTIVE ’99 workshop with its purely application-oriented focus was an important
step in this direction. Its proceedings are, together with the three volumes of the “Graph
Transformation Handbook” [Roz97,EEKR99,EKMR99], the most comprehensive compilation
of currently available graph transformation tools and applications. We hope that they
encourage new and old members of the GT community to use currently existing GT
tools or to build better ones and to keep the panelists’ list of proposals in mind
when they start new research activities.



Best Presentation and Demonstration Awards



Bernhard Westfechtel 

Aachen University of Technology
Computer Science III, D-52056 Aachen, Germany

As a further incentive to improve the quality of the presentations, the Program
Committee decided to honor the best long presentation, the best short
presentation, and the best tool demonstration with respective awards. The winners
were selected by the workshop participants’ votes. At the end of the workshop,
each winner received a certificate which testifies to the quality of his presentation
and motivates him to give high-quality presentations at future events as well.

The workshop participants judged that Albert Zündorf gave the best long
presentation. He reported on joint work with Jörg Niere, both University of
Paderborn. In his vivid talk, he presented an application of graph transformations
to the development of production control systems. The specification
language FUJABA is used to describe the behavior of autonomous production
agents (e.g., robots in a flexible manufacturing system). The specification is
validated through graphical simulation. Albert’s work is remarkable because of its
application to real world problems in production control.

Andreas Zamperoni received the best short presentation award. He
presented a joint paper with Gregor Engels, University of Paderborn. Since
completing his Ph.D. studies at Leiden, Andreas has been with Bosch for several
years. Having acquired industrial experience, he is now in a position to comment
on his earlier work on graph rewriting from a practical perspective. His talk gave
valuable insights into the strengths and weaknesses of PROGRES, the specification
language which he applied to software engineering problems.

Finally, Przemyslaw Prusinkiewicz, University of Calgary, received
the best tool demonstration award. L-studio/cpfg, a system which he
developed jointly with colleagues from Mountain View and Brisbane, is used to model
and simulate the growth of plants. It also includes algorithms for generating
patterns and fractals. Internally, L-studio/cpfg is based on the well-known L-systems
for multi-cellular development. It is fascinating to observe the beautiful
graphics illustrating plant development which are generated from them. Indeed,
the tool demonstration provided an outstanding aesthetic experience.



Category                  Winner                     Title

Best long presentation    Albert Zündorf             Using FUJABA for the development of
                                                     production control systems

Best short presentation   Andreas Zamperoni          Formal integration of software engineering
                                                     aspects using a graph rewrite system - a
                                                     typical experience?!

Best tool demonstration   Przemyslaw Prusinkiewicz   L-studio/cpfg: A software system for
                                                     modeling plants